Polymorphic repetitive sequences in chlamydiae and uses thereof

ABSTRACT

In general, the invention features a method for determining the presence of a strain of chlamydia in a biological sample. The method includes the steps of (a) providing a biological sample; and (b) determining the presence of a polynucleotide containing a polymorphic repetitive sequence in a polynucleotide in the sample, wherein the polymorphic repetitive sequence is associated with a first strain of chlamydia and not associated with a second strain of chlamydia. In this method, the presence of the polynucleotide containing the polymorphic repetitive sequence indicates the presence of the first strain of chlamydia.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Application No. 60/366,477, filed Mar. 21, 2002.

BACKGROUND OF THE INVENTION

[0002] The invention relates to the field of diagnosis and treatment of infectious diseases.

[0003] The chlamydiae are obligate intracellular pathogens that cause a variety of diseases in animal species at virtually every phylogenetic level. Of these, Chlamydia (C.) trachomatis and C. pneumoniae are considered the most significant human pathogens. C. trachomatis is the leading cause of preventable blindness worldwide and the most common sexually transmitted bacterial species. C. pneumoniae causes 10% to 20% of community-acquired pneumonia worldwide and has recently been associated with coronary arteriosclerosis and multiple sclerosis. The chlamydiae undergo a developmental cycle unique among prokaryotes. The elementary body is infectious, but is metabolically inactive and cannot replicate. This form differentiates upon infection into the non-infectious reticulate body, a larger pleomorphic bacterium that is metabolically active and multiplies. Following uptake, chlamydiae develop and grow within an intracellular vacuole, called an inclusion, where they differentiate from the elementary body to the reticulate body and then back to the elementary body.

[0004] Chlamydiae encode an abundant protein termed the major outer membrane protein (MOMP, or OmpA) that is surface exposed in C. psittaci and C. trachomatis and is the major determinant for serologic classification of chlamydial isolates. This protein is highly variable within its exposed domains except in ruminant invasive C. psittaci, feline strains of C. psittaci, and C. pneumoniae, where these domains are extremely conserved.

[0005] Completion of the sequences of five chlamydial genomes (one C. trachomatis, three C. pneumoniae, and one C. muridarum) has revealed the importance of a group of proteins unique to the chlamydiae, the polymorphic membrane proteins (Pmps). These proteins had been shown previously to be antigenic in C. psittaci. The genes encoding these proteins belong to a complex family and span 13.6% and 17.5% of the C. trachomatis and C. pneumoniae genomes, respectively. There is a considerable expansion of these genes in C. pneumoniae; the C. trachomatis genome possesses 9 pmp genes (A to I) whereas the C. pneumoniae genome possesses 21 pmp genes. Pmps are characterized by two repeated tetrapeptidic motifs, almost never found outside chlamydiae: GGA(L/V/I) (SEQ ID NO: 1) and FXXN (SEQ ID NO: 2). The non-chlamydial proteins exhibiting these motifs have been implicated in adherence to mammalian tissues. As the Pmps have been localized at the chlamydial cell surface, their role in adhesion, molecular transport, signaling, or some other cell wall associated function is likely.
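By way of illustration only, the two tetrapeptide motifs described above can be counted in a protein sequence with regular expressions. The following is a minimal sketch; the function name and the example sequence are hypothetical and are not taken from any of the SEQ ID NOs.

```python
import re

# Motifs as written in the text: GGA(L/V/I) and FXXN, where X is any residue.
GGA_MOTIF = re.compile(r"GGA[LVI]")
# A lookahead is used so that overlapping FXXN occurrences are all counted.
FXXN_MOTIF = re.compile(r"(?=(F..N))")

def count_pmp_motifs(protein: str) -> dict:
    """Count occurrences of the two Pmp-associated motifs in a protein sequence."""
    seq = protein.upper()
    return {
        "GGA[LVI]": len(GGA_MOTIF.findall(seq)),
        "FXXN": len(FXXN_MOTIF.findall(seq)),
    }

# Hypothetical toy sequence, used only to exercise the function.
example = "MKFSTNGGAIAAFYANSGGALTTGGAV"
print(count_pmp_motifs(example))
```

A high count of both motifs is only suggestive; the sketch does not attempt to reproduce any annotation criteria used for the pmp gene family.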

[0006] Prokaryotic genomes are compact, with sizes ranging from less than 600 kb in Mycoplasma to more than 10 Mb in several cyanobacterial and myxobacterial species. Chlamydial genomes range from 1 to 1.2 Mb. These compact genomes have likely been maintained through selective pressure for rapid DNA replication and cell reproduction. Furthermore, the obligate intracellular way of life of the chlamydiae tends to minimize the length of the genome. It was therefore expected that repetitive sequences would be kept to a minimum under natural selection for rapid growth. Various classes of repetitive DNA elements have been recently discovered in many prokaryotes (Rocha et al., Mol. Biol. Evol. 16:1219-1230, 1999). Such repetitive sequences can be the cause or the hallmark of the plasticity of the genome. Thus, bacteria could have evolved mechanisms based on the presence of repeated sequences for increasing the frequency of random variations in a specific subset of genes. Molecular mechanisms at the basis of this variation are essentially based on slipped-mispair of replicating strands for close repeats and homologous recombination between long intra-chromosomal repeats. These highly mutable loci, sometimes called ‘contingency’ loci, would be involved in critical interactions with the environment, allowing certain phenotypic traits to respond rapidly, by natural selection, to unpredictable changes.

SUMMARY OF THE INVENTION

[0007] Using an in silico approach, we have examined repeats within the complete genomes of chlamydiae. This analysis focused on the search for repeats of statistically significant length, taking into account the genome size and composition. We then determined whether those repeats were sites for sequence variation in vivo.
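One way to picture how genome size and composition enter into a significance threshold is sketched below. This is an assumption-laden illustration, not the statistic actually used in the analysis: it treats the genome as a random sequence with independent bases and asks for the shortest exact repeat length whose expected number of chance occurrences falls below a chosen cutoff. The function names, the cutoff, and the base frequencies are all hypothetical.

```python
from math import comb

def expected_repeat_pairs(genome_len, repeat_len, base_freqs):
    """Expected number of position pairs carrying the same exact word of
    repeat_len in a random sequence with independent bases."""
    match_prob = sum(p * p for p in base_freqs.values())  # chance two random bases agree
    windows = genome_len - repeat_len + 1                  # number of start positions
    return comb(windows, 2) * match_prob ** repeat_len

def min_significant_length(genome_len, base_freqs, alpha=0.01):
    """Shortest repeat length expected fewer than alpha times by chance."""
    length = 1
    while expected_repeat_pairs(genome_len, length, base_freqs) >= alpha:
        length += 1
    return length

# Hypothetical inputs: a 1.2 Mb genome with an assumed 40% G+C composition.
freqs = {"A": 0.3, "T": 0.3, "G": 0.2, "C": 0.2}
print(min_significant_length(1_200_000, freqs))
```

Under this simple model, a larger genome or a more skewed base composition raises the length a repeat must reach before it is unlikely to have arisen by chance.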

[0008] We discovered that the repeated sequences in different strains of chlamydiae were polymorphic. The presence of a particular polymorphism can thus be used to detect the presence of a particular strain by detecting the presence of a polymorphic repeated sequence associated with that strain and not associated with other strains of chlamydiae.

[0009] Accordingly, in a first aspect, the invention features a method for determining the presence of a strain of chlamydia in a biological sample. The method includes the steps of (a) providing a biological sample; and (b) determining the presence of a polynucleotide containing a polymorphic repetitive sequence in a polynucleotide in the sample, wherein the polymorphic repetitive sequence is associated with one strain of chlamydia and not associated with other strains of chlamydiae. In this method, the presence of the polynucleotide containing the polymorphic repetitive sequence indicates the presence of that strain of chlamydia.

[0010] In a second, related aspect, the invention features a method for determining the presence of a plurality of strains of chlamydiae in a biological sample. This method includes the steps of: (a) providing a biological sample; and (b) determining the presence in the biological sample of a plurality of polynucleotides, each containing a polymorphic repetitive sequence, wherein each polymorphic repetitive sequence is associated with one strain of chlamydia and not associated with other strains of chlamydiae. In this method, the presence of a polymorphic repetitive sequence indicates the presence of the strain of chlamydia associated with that polymorphic repetitive sequence, and the absence of that polymorphic repetitive sequence indicates the absence of the associated strain of chlamydia.
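The presence/absence logic of this multi-strain method can be sketched as a simple lookup. The marker names below are placeholders invented for illustration; they do not correspond to any entry in the tables, and the function name is likewise hypothetical.

```python
def call_strains(detected_markers, marker_panel):
    """Map each strain to True/False according to whether its
    strain-specific polymorphic repeat was detected in the sample."""
    detected = set(detected_markers)
    return {strain: marker in detected for marker, strain in marker_panel.items()}

# Hypothetical panel: one strain-specific marker per strain.
panel = {
    "marker_A": "C. pneumoniae CWL-029",
    "marker_B": "C. pneumoniae AR 39",
    "marker_C": "C. trachomatis D/UW-3/Cx",
}
result = call_strains(["marker_A", "marker_C"], panel)
print(result)
```

The sketch assumes each marker is associated with exactly one strain, as the method requires; markers shared between strains would need a pattern-based call instead.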

[0011] In another aspect, the invention features a method for treating a chlamydial infection in a patient. This method includes the steps of (a) providing a biological sample from the patient; (b) determining the presence in the biological sample of a plurality of polynucleotides, each containing a polymorphic repetitive sequence, wherein each polymorphic repetitive sequence is associated with one strain of chlamydia and not associated with other strains of chlamydiae; and (c) administering to the patient anti-chlamydial agents that are effective against the strains of chlamydiae that are present in the biological sample.

[0012] In any of the foregoing methods, the strain of chlamydia can be a strain of any chlamydial species (e.g., C. psittaci, C. trachomatis, C. pecorum, C. abortus, C. caviae, C. felis, C. suis, C. muridarum, Neochlamydia (N.) hartmannellae, Parachlamydia (P.) acanthamoebae, Simkania (S.) negevensis, and Waddlia (W.) chondrophila). In particular embodiments, the strain of chlamydia is C. pneumoniae strain CWL-029, C. pneumoniae strain AR 39, C. pneumoniae strain J138, or C. trachomatis strain D/UW-3/Cx.

[0013] The polymorphic repetitive sequence can be a simple sequence repeat (SSR); a small close or tandem repeat (TR); or a large repeat (LR). Exemplary SSRs and their locations are listed in Tables 1, 5, 9, 13, and 16, below. The locations of exemplary TRs and LRs are listed in Tables 2-4, 6-8, 10-12, 14, 15, and 17-19.

[0014] The biological sample can be a biopsy sample, blood, serum, peripheral blood mononuclear cells, cerebrospinal fluid, urine, nasal secretion, saliva, or any other biological sample that may contain chlamydiae. The method of detecting the presence of a polymorphic repetitive sequence can include any suitable polynucleotide detection step, e.g., amplification of polynucleotide molecules that contain a polymorphic repetitive sequence.

[0015] By “chlamydia” or “chlamydiae” is meant organisms of the order Chlamydiales. Examples include, but are not limited to, C. psittaci, C. trachomatis, C. pecorum, C. abortus, C. caviae, C. felis, C. suis, C. muridarum, N. hartmannellae, P. acanthamoebae, S. negevensis, and W. chondrophila. By “chlamydial infection” is meant an infection of a cell or organism by an organism of the order Chlamydiales.

[0016] By “polypeptide” is meant any chain of more than two amino acids, regardless of post-translational modification such as glycosylation or phosphorylation.

[0017] In another aspect, the invention features a purified polypeptide that is substantially identical to a POMP2 polypeptide of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 5, or a POMP4 polypeptide of SEQ ID NO: 6, SEQ ID NO: 7, or SEQ ID NO: 8.

[0018] In a related aspect, the invention features a purified polynucleotide encoding a polypeptide that is substantially identical to a POMP2 polypeptide of SEQ ID NO: 3, SEQ ID NO: 4, or SEQ ID NO: 5, or a POMP4 polypeptide of SEQ ID NO: 6, SEQ ID NO: 7, or SEQ ID NO: 8.

[0019] In still another aspect, the invention features a method of identifying a compound useful for treating or preventing an infection of C. pneumoniae. This method includes the steps of: (a) contacting a candidate compound and a POMP polypeptide; and (b) determining the specific binding of the candidate compound to the POMP polypeptide. A candidate compound that specifically binds to the POMP polypeptide is identified as a compound useful for treating or preventing an infection of C. pneumoniae.

[0020] The invention features another method of identifying a compound useful for treating or preventing an infection of C. pneumoniae. This method includes the steps of: (a) contacting a candidate compound and a POMP polynucleotide; and (b) determining the specific binding of the candidate compound to the polynucleotide, wherein a candidate compound that specifically binds to the polynucleotide is identified as a compound useful for treating or preventing an infection of C. pneumoniae.

[0021] The invention also features a method of immunizing a subject against an infection of C. pneumoniae by administering to the subject a purified POMP polypeptide or an immunogenic fragment thereof in an amount sufficient to induce an immune response to the POMP polypeptide or fragment thereof.

[0022] In still other aspects, the invention features a peptide fragment of a POMP2 or POMP4 polypeptide, an isolated antibody that specifically binds a POMP2 or POMP4 polypeptide, an antigenic composition that includes a POMP2 or POMP4 polypeptide (or a fragment thereof) and a pharmaceutically acceptable carrier or diluent, and a pharmaceutical composition that includes an antibody that specifically binds a POMP2 or POMP4 polypeptide and a pharmaceutically acceptable carrier or diluent.

[0023] The invention also features a method of producing an immune response in an animal by immunizing the animal with an effective amount of a POMP polypeptide (e.g., a POMP2 or POMP4 polypeptide) or a peptide fragment of a POMP polypeptide.

[0024] POMP polypeptides that are a part of the invention include those that are substantially identical to C. pneumoniae POMP2 or POMP4 (FIGS. 2A-2C and 3A-3C, respectively). POMP polynucleotides that are a part of the invention include those encoding POMP polypeptides as defined above, as well as polynucleotides substantially identical to POMP1, POMP2, POMP3, POMP4, POMP5, POMP6, or POMP7 (FIG. 1).

[0025] By “substantially identical” is meant a polypeptide or polynucleotide exhibiting at least 95%, 99%, 99.5%, or 99.9% identity to a reference amino acid or polynucleotide sequence. For polypeptides, the length of comparison sequences will generally be at least 16 amino acids, preferably at least 20 amino acids, more preferably at least 25 amino acids, and most preferably 35 amino acids. For polynucleotides, the length of comparison sequences will generally be at least 50 nucleotides, preferably at least 60 nucleotides, more preferably at least 75 nucleotides, and most preferably 110 nucleotides.
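For illustration, the percent-identity threshold and minimum comparison length can be expressed as follows. This is a simplified sketch that assumes the two sequences have already been aligned to equal length without gaps; it is not a substitute for the sequence analysis software referred to in this description, and the function names and example sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that match between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

def substantially_identical(seq_a, seq_b, threshold=95.0, min_length=16):
    """Apply an identity threshold over a comparison window of at least min_length."""
    if len(seq_a) < min_length:
        return False
    return percent_identity(seq_a, seq_b) >= threshold

# Hypothetical 20-residue peptides differing at one position (19/20 = 95%).
ref = "MKTAYIAKQRQISFVKSHFS"
var = "MKTAYIAKQRQISFVKSHFT"
print(percent_identity(ref, var), substantially_identical(ref, var))
```

The defaults shown (95% and 16 amino acids) are the least stringent polypeptide values given above; the nucleotide case would use a minimum length of 50.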

[0026] Sequence identity is typically measured using sequence analysis software with the default parameters specified therein (e.g., BLAST 2 (Tatusova et al., FEMS Microbiol Lett. 174:247-250, 1999); Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, Wis. 53705). These programs match similar sequences by assigning degrees of homology to various substitutions, deletions, and other modifications. Conservative substitutions typically include substitutions within the following groups: glycine, alanine, valine, isoleucine, leucine; aspartic acid, glutamic acid, asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine, tyrosine.

[0027] By “high stringency conditions” is meant hybridization in 2× SSC at 40° C. with a DNA probe length of at least 40 nucleotides. For other definitions of high stringency conditions, see F. Ausubel et al., Current Protocols in Molecular Biology, pp. 6.3.1-6.3.6, John Wiley & Sons, New York, N.Y., 1994, hereby incorporated by reference.

DESCRIPTION OF THE DRAWINGS

[0028] FIG. 1 is a schematic illustration showing a family of POMP elements, their positions in three strains of C. pneumoniae, the number of cytidines in the SSR, and the genes annotated for their region.

[0029] FIG. 2A is a schematic illustration showing the amino acid sequence of POMP2 from C. pneumoniae strain CWL-029.

[0030] FIG. 2B is a schematic illustration showing the amino acid sequence of POMP2 from C. pneumoniae strain J138.

[0031] FIG. 2C is a schematic illustration showing the amino acid sequence of POMP2 from C. pneumoniae strain AR 39.

[0032] FIG. 3A is a schematic illustration showing the amino acid sequence of POMP4 from C. pneumoniae strain CWL-029.

[0033] FIG. 3B is a schematic illustration showing the amino acid sequence of POMP4 from C. pneumoniae strain J138.

[0034] FIG. 3C is a schematic illustration showing the amino acid sequence of POMP4 from C. pneumoniae strain AR 39.

[0035] Other features and advantages of the invention will be apparent from the following description of the preferred embodiments thereof.

DETAILED DESCRIPTION OF THE INVENTION

[0036] Using algorithms designed to search for different types of repeats, we identified three classes of statistically significant repeats in the complete genomes of sequenced chlamydiae species (C. pneumoniae CWL-029, AR 39, J138, C. trachomatis D/UW-3/Cx, C. muridarum). These include (1) simple sequence repeats (SSRs); (2) small close or tandem repeats (TRs); and (3) large repeats (LRs). TRs and SSRs are thought to change by slipped-mispair at the time of replication or by single-strand annealing when the sequence faces double-strand breaks. Both mechanisms can result in conversion or deletion, but slipped mispair may also result in multiplication. LRs are thought to vary by homologous recombination, and this can lead to conversion or deletion. Additionally, recombination between direct LRs can result in multiplication, whereas recombination between inverted LRs may result in inversion. Hence, different repeats represent different recombination potentials that may result in substantially different outputs.
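As a rough illustration of the first repeat class, an SSR search can be sketched with a regular expression that reports runs of a motif repeated at least a given number of times. The algorithms actually used are not reproduced here; the function name and the toy sequence are hypothetical, positions are 0-based, and no test of statistical significance is applied.

```python
import re

def find_ssrs(sequence: str, motif: str, min_repeats: int):
    """Return (start, length) for each run of `motif` repeated at least
    `min_repeats` times in `sequence` (0-based start positions)."""
    pattern = re.compile(f"(?:{re.escape(motif)}){{{min_repeats},}}")
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(sequence.upper())]

# Hypothetical toy sequence: a 14-base cytosine tract and five TTC repeats.
toy = "AT" + "C" * 14 + "GA" + "TTC" * 5 + "G"
print(find_ssrs(toy, "C", 12))    # homopolymer runs of 12 or more
print(find_ssrs(toy, "TTC", 5))   # trinucleotide repeated 5 or more times
```

The thresholds in the two calls mirror the N>11 and N>4 cutoffs that head the SSR tables for homopolymer and TTC-type repeats.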

[0037] We have found, when looking at a large collection of strains, that these repeated sequences are polymorphic. Thus, the pmp_10.2 cytosine stretch has been shown to be variable within C. pneumoniae strains, resulting in a shift out of frame in CWL-029 but not in AR 39 or TW-183 (Grimwood et al., Infect. Immun. 69:2383-2389, 2001). Another difference between C. pneumoniae strains is a 393 nucleotide sequence (coding for 131 amino acids) in the 5′ part of pmp_6, which is present three times in CWL-029 and J138 but only two times in AR 39.

[0038] These polymorphisms can be used as molecular markers that might differentiate strains bearing a conserved MOMP. The identification of subgroups within these groups (ruminant invasive C. psittaci, feline C. psittaci, and C. pneumoniae) should allow the search for correlations with the virulence and the different observed clinical syndromes.

[0039] Polymorphisms within simple and tandem repeats, according to their position in a coding or non-coding region, will either shift the reading frame and generate a stop codon (if the change in length is not a multiple of three) or modify the length of the promoter. Both mechanisms will lead to a modulation of the functional protein. The presence of polymorphisms allows for the identification of particular strains based on the presence of a particular polymorphism or pattern of polymorphisms.
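The multiple-of-three rule can be checked with simple arithmetic, as in the following sketch. The function name is hypothetical; the example lengths are taken from the text and tables (a cytosine tract of 13 versus 14 bases in pmp_10.2/pmp_10, and the 393 nucleotide unit in pmp_6).

```python
def shifts_reading_frame(reference_length: int, observed_length: int) -> bool:
    """True if the change in repeat length is not a multiple of three,
    i.e. the downstream reading frame is shifted."""
    return (observed_length - reference_length) % 3 != 0

# One extra cytosine in a coding homopolymer tract shifts the frame.
print(shifts_reading_frame(13, 14))
# Gaining or losing a whole 393-nt unit (131 codons) keeps the frame.
print(shifts_reading_frame(3 * 393, 2 * 393))
```

The check applies only to repeats inside a coding region; for repeats in non-coding regions the relevant effect is the change in promoter length.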

[0040] Diagnostic Assays

[0041] As the presence of a particular polymorphic repetitive sequence is likely to correlate with the presence of a particular strain of chlamydiae, the invention features a method for determining the presence of one or more strains of chlamydiae in a patient. In the methods of the invention, a sample from an individual, such as an individual who is suspected of having a chlamydial infection or a disease associated with a chlamydial infection, is used. The test sample can include blood, serum, cerebrospinal fluid, urine, nasal secretion, saliva, or any other bodily fluid or tissue, or polynucleotides isolated from one of the foregoing samples.

[0042] The sample can be assayed for the presence or absence of the polymorphic repetitive sequence by Southern hybridization using a detectable probe for the appropriate polymorphic repetitive sequence. Alternatively, the test sample can be assayed using quantitative PCR or RT-PCR (e.g., by using a LightCycler™ (Idaho Technology Inc., Idaho Falls, Id.) and fluorescent LightCycler™ probes). The presence of the polymorphic repetitive sequence in the test sample is indicative of the presence of chlamydiae in the test sample. To facilitate assaying a test sample for the presence or absence of chlamydiae by detecting the presence or absence of a polymorphic repetitive sequence, the test sample can be subjected to methods to enhance isolation of chlamydia elementary bodies from the test sample and to release DNA from the elementary bodies. For example, elementary bodies have a tendency to adhere to the walls of a receptacle containing them; the elementary bodies can be removed from the receptacle by treating the receptacle containing the elementary bodies with trypsin/EDTA, thereby releasing elementary bodies that adhered to the receptacle; and then concentrating the released elementary bodies, such as by centrifugation or filtration. To release DNA from elementary bodies, the elementary bodies are incubated under disulfide reducing conditions, such as incubating the elementary bodies with a disulfide reducing agent such as dithiothreitol (DTT) or 2-mercaptoethanol; and digesting the elementary bodies with a protease (see, e.g., U.S. Pat. No. 6,258,532, hereby incorporated by reference).

[0043] The test sample can also be assayed for the presence of chlamydiae by detecting the presence of a polymorphic repetitive sequence in a protein from chlamydia. For example, the presence of a PMP protein having a particular polymorphic repetitive sequence in the test sample can be detected through the use of ELISA methodologies with an antibody that specifically recognizes the polymorphic repetitive sequence. Alternatively, the test sample may be assayed for the presence of chlamydiae by detecting the presence of human antibodies to polymorphic repetitive sequences in the test sample. The presence of a polymorphic repetitive sequence or antibodies to a polymorphic repetitive sequence in the test sample is indicative of the presence of chlamydiae in the test sample. The presence of proteins or antibodies may be detected by appropriate methods such as by ELISA, western blot, or isoelectric focusing.

[0044] The diagnostic methods described herein are useful for detecting or confirming the disease in a patient, as well as for monitoring the progress of the disease. Disease monitoring is useful, for example, for determining the efficacy of a particular therapy.

[0045] The invention will be further illustrated by the following non-limiting examples.

[0046] Identification of Polymorphic Repetitive Sequences

[0047] The tables containing the elements found in the five chlamydia genomes follow below, ordered by genome and by repeat type. The GenBank accession identification numbers are as follows: AE001363 (C. pneumoniae CWL-029; Kalman et al., Nature Genet. 21:385-389, 1999); AE002161 (C. pneumoniae AR 39; Read et al., Nucleic Acids Res. 28:1397-1406, 2000); BA000008 (C. pneumoniae J138; Shirai et al., Nucleic Acids Res. 28:2311-2314, 2000); AE001273 (C. trachomatis; Stephens et al., Science 282:754-759, 1998); AE002160 (C. muridarum; Read et al., Nucleic Acids Res. 28:1397-1406, 2000). “ID” indicates an identification tag; “position” indicates the position of the start of the repeat in the respective genome; “length” indicates the repeat length; “gene” indicates the gene where the repeat was found (if applicable); “sense” indicates whether the repeat is in the direct (d) or inverse (i) strand; “equivalent to” indicates the equivalent elements of the other genomes; “note” includes either the strand of the gene where the repeat stands or the flanking genes, in which case “D/C” stands for the position of the genes (direct or complement strands); and UFO indicates an unknown function ORF. For large repeats, “first” refers to the first occurrence of the repeat and “second” to the second occurrence. “Period” is in the form A×B, wherein “A” indicates the number of times the motif is repeated and “B” indicates the length of the repeat. By “-” is meant that a consensus cannot be established to determine A and B with precision.

TABLE 1 Chlamydia pneumoniae strain CWL-029 SSRs
id | position | length | gene | sense | note | equivalent to
C(G)_N (N>11)
C1 | 10806 | 14 | INT | d | D/UFO/CPn0007 D/UFO/CPn0008 | J1, A4
C2 | 13350 | 14 | INT | d | D/UFO/CPn0009 D/UFO/CPn0010 | J2
C3 | 20588 | 14 | pmp_2 | d | D | A3, J3
C4 | 58474 | 14 | CPn0043 | d | D | A2, J4
C5 | 85336 | 14 | CPn0069 | d | D | -
C6 | 507200 | 13 | pmp_10.2 | d | C | J5, A1
C7 | 1207061 | 13 | CPn1054 | d | D | J6
C8 | 1209609 | 12 | INT | d | D/UFO/CPn1054 D/UFO/CPn1055 | A5, J7
ACC_N/CAC_N (N>3)
C9 | 628400 | 14 | CPn0542 | d | D/ABC transporter | J8, A9
TCC_N (N>4)
C10 | 1150530 | 15 | ftsH | d | D/protease | J10, A8
TTC_N (N>4)
C11 | 956212 | 15 | yphC | d | C/GTPase | J11, A7
CGT_N/GTC_N (N>3)
C12 | 607260 | 13 | CPn0525 |  | D/UFO | J9, A6, TR4, M7
ATGCT_N (N>2)
C13 | 258158 | 15 | ypdP | d | D/UFO | J12, A11
ATTAA_N (N>2)
C14 | 407929 | 15 | INT | d | C/sigma/rpsD C/flagellar secretion/flhA | J13
TTTCT_N (N>2)
C15 | 396387 | 15 | CPn0352 | d | D/UFO | J14, A10

[0048] TABLE 2 Chlamydia pneumoniae strain CWL-029 TRs
id | position | length | period | genes | note | equivalent to
C1 | 7547 | 937 | 3 × 330 | INT | C/UFO/CPn0006 D/UFO/CPn0007 | A12
C2 | 10807 | 178 | 2 × 89 | INT | D/UFO/CPn0007 D/UFO/CPn0008 | - (A12?)
C3 | 240764 | 45 | 2 × 13 | INT | D/oppA_4 D/oppB_1 | J2, A11
C4 | 255447 | 35 | 2 × 14 | INT | C/UFO/CPn0214 | A9, J4
C5 | 278045 | 26 | 2/3 × 8 | INT | C/CPn0240 C/CPn0241 | A8, J5
C6 | 341108 | 32 | 2 × 15 | lpxD | D/UDP-acyltransferase | J6, A7
C7 | 379100 | 40 | - | INT | many erased, C/CPn0333 D/CPn0334 | A6, J7
C8 | 432020 | 330 | - | hctB | many erased, C/histone like | J8, A5, M1, TR1
C9 | 451458 | 30 | 2 × 15 | CPn0405 | C/UFO | J9, A4
C10 | 492298 | 20 | 4 × 6 | pmp_6 | D | J10, A3
C11 | 568858 | 55 | 2 × 18 | CPn0487 | C/UFO | A2, J11
C12 | 662224 | 80 | - | INT | many erased, C/murA D/UFO/CPn0572 | J12, A1, M9, TR4
C13 | 916873 | 70 | - | CPn0809 | many erased, C/UFO | J13, A16, M11, TR7
C14 | 984289 | 30 | 2 × 13 | rodA | D/rod shape protein | J14, A15
C15 | 1028449 | 26 | 2 × 13 | CPn0897 | C/phosphohydrolase | J15, A14
C16 | 1085124 | 18 | 3 × 6 | glgA | C/glycogen synthase | J16, A13

[0049] TABLE 3 Chlamydia pneumoniae strain CWL-029 LRs (inverse)
id | first position | second position | length | first location | second location | equivalent to
C1 | 207095 | 208884 | 35 | D/CPn0165/UFO | C/CPn0169/UFO | J1, A1
C2 | 493543 | 506266 | 23 | D/pmp_6 | INT | J3, A3
C3 | 954974 | 955029 | 32 | C/CPn0843 | C/CPn0843 | J2, A2

[0050] TABLE 4 Chlamydia pneumoniae strain CWL-029 LRs (direct)
id | first position | second position | length | first location | second location | equivalent to
C1 | 26238 | 29415 | 23 | D/pmp_4.2 | D/pmp_5.2 | -
C2 | 234959 | 236693 | 27 | D/oppA_1 | D/oppA_2 | J1, A10
C3 | 259232 | 259385 | 26 | INT | D/tgt/tRNA transferase | J2, A9
C4 | 290023 | 292838 | 40 | C/CPn0255/UFO | INT | J3, A8
C5 | 415142 | 416513 | 31 | D/CPn0369/UFO | D/CPn0370/UFO | J4, A6
C6 | 495909 | 498766 | 23 | D/pmp_7 | D/pmp_8 | J6, A4
C7 | 501979 | 514804 | 24 | D/pmp_9 | D/pmp_13 | J7, A3
C8 | 522778 | 525176 | 28 | C/CPn0457/UFO | C/CPn0458/UFO | J8, A2
C9 | 528528 | 530945 | 29 | C/CPn0461/UFO | C/CPn0462/UFO | J9, A1
C10 | 1111630 | 1113279 | 1650 | D/glmS/aminotransferase; D/tyrP_1/transport | D/yccA_transport trunc; D/tyrP_2/transport | A13

[0051] TABLE 5 Chlamydia pneumoniae strain AR 39 SSRs
id | position | length | gene | sense | note | equivalent to
G(C)_N (N>11)
A1 | 334377 | 13 | CP0303 | d | D/PmpG | C6
A2 | 782709 | 14 | CP0730 | d | C/UFO | C4, J4
A3 | 820588 | 14 | CP0761 | d | C/PmpG | C3, J3
A4 | 830377 | 14 | INT | d | C/CP0766/UFO C/CP0767/UFO | J1, C1
A5 | 861807 | 15 | INT |  | C/CP0795 C/CP0796 | C8, J7
ACG_N/GAC_N (N>3)
A6 | 234314 | 13 | CP0228 | d | C/UFO | J9, C12, TR4, M7
GAA_N (N>4)
A7 | 1115211 | 15 | CP1025 | d | D/GTP_binding | J11, C11
GGA_N (N>4)
A8 | 920865 | 15 | CP0857 | d | C/FtsH | C10, J10
GGT_N/GTG_N (N>3)
A9 | 213173 | 13 | INT | d | C/CP0209 C/CP0211 ABC transporters | C9, J8
AGAAA_N (N>2)
A10 | 444793 | 15 | INT | d | C/CP0406/UFO C/CP0408/ATP carrier | J14, C15
AGCAT_N (N>2)
A11 | 583107 | 15 | CP0548 | d | C/UFO | J12, C13
TTAAT_N (N>2)
A12 | 433252 | 15 | CP0415 | d | D/reductoisomerase | -

[0052] TABLE 6 Chlamydia pneumoniae strain AR 39 TRs
id | position | length | period | genes | note | equivalent to
A1 | 179248 | 135 | - | INT | C/CP0177/UFO D/CP0178/transferase | J12, C12, M9, TR4
A2 | 272675 | 43 | 2/3 × 13 | CP0267 | D/UFO | C11, J11
A3 | 349273 | 18 | 3 × 6 | CP0309 | C/PmpG | J10, C10
A4 | 389712 | 30 | 2 × 15 | CP0350 | D/UFO | C9, J9
A5 | 408951 | 235 | - | CP0371 | D/Nucleoprotein | J8, C8, M1, TR1
A6 | 462041 | 145 | - | INT | C/UFO/CP0424 D/UFO/CP0425 | C7, J7
A7 | 500140 | 33 | 2 × 15 | CP0456 | C/UDP-transferase | C6, J6
A8 | 563145 | 200 | - | INT | D/UFO/CP0521 D/UFO/CP0522 | C5, J5
A9 | 585797 | 35 | 2 × 14 | CP0551 | D/UFO | C4, J4
A10 | 587034 | 24 | 3/4 × 7 | INT | D/UFO/CP0551 D/UFO/CP0553 | (J3?)
A11 | 600477 | 46 | 2 × 13 | INT | C/UFO/CP0568 C/UFO/CP0569 | C3, J2
A12 | 832710 | 990 | 3 × 330 | CP0769 | C/UFO | C1
A13 | 986268 | 18 | 3 × 6 | CP0911 | D/glycogen synthase | J16, C16
A14 | 1042952 | 37 | 2 × 13 | CP0969 | D/UFO | J15, C15
A15 | 1087106 | 43 | 2 × 13 | CP1002 | C/MrdB | J14, C14
A16 | 1154483 | 40 | - | CP1062 | D/UFO | C13, J13, M11, TR7

[0053] TABLE 7 Chlamydia pneumoniae strain AR 39 LRs (inverse)
id | first position | second position | length | first location | second location | equivalent to
A1 | 632368 | 634157 | 35 | D/CP0602/UFO | C/CP0606/UFO | J1, C1
A2 | 1116377 | 1116432 | 32 | D/CP1026 | D/CP1026 frameshifted | C3, J2
A3 | 335302 | 348025 | 23 | D/CP0303/pmpG | C/CP0309/pmpG | C2, J3

[0054] TABLE 8 Chlamydia pneumoniae strain AR 39 LRs (direct)
id | first position | second position | length | first location | second location | equivalent to
A1 | 310615 | 313032 | 29 | D/CP0290/UFO | D/CP0291/UFO | C9, J9
A2 | 316385 | 318783 | 28 | D/CP0294/UFO | D/CP0295/UFO | C8, J8
A3 | 326762 | 339588 | 24 | C/CP0299/pmpG | C/CP0306/pmpG | J7, C7
A4 | 342802 | 345659 | 23 | C/CP0307/pmpG | C/CP0308/pmpG | J6, C6
A5 | 349362 | 349755 | 47 | C/CP0309/pmpG | C/CP0309/pmpG | J5
A6 | 424652 | 426023 | 31 | C/CP0387/UFO | C/CP0388/UFO | J4, C5
A7 | 540623 | 541657 | 449/365 | C/CP0493/UFO | C/CP0495/UFO | -
A8 | 548403 | 551218 | 40 | INT | D/CP0506/UFO | J3, C4
A9 | 581869 | 582022 | 26 | C/CP0546/tRNA transferase | INT | J2, C3
A10 | 604567 | 606301 | 27 | C/CP0571/ABC tr | C/CP0572/ABC tr | J1, C2
A11 | 811759 | 814936 | 24 | INT | INT | -
A12 | 947563 | 948860 | 1144 | C/CP0878/UFO | C/CP0879/UFO | -
A13 | 956482 | 958131 | 1650 | C/CP0888/UFO; C/CP0889/permease | C/CP0891/permease | C10

[0055] TABLE 9 Chlamydia pneumoniae strain J138 SSRs
id | position | length | gene | sense | note | equivalent to
C(G)_N (N>11)
J1 | 10806 | 14 | INT | d | D/UFO/CPj0007 D/UFO/CPj0008 | C1, A4
J2 | 13350 | 13 | INT | d | D/UFO/CPj0009 D/UFO/CPj0010 | C2
J3 | 20597 | 13 | pmp_2_1 | d | D | C3, A3
J4 | 58475 | 14 | CPj0043 | d | D/UFO | C4, A2
J5 | 506847 | 14 | pmp_10 | d | C | C6, A1
J6 | 1205090 | 12 | CPj1054 | d | D/UFO | C7
J7 | 1207641 | 16 | INT | d | D/UFO/CPj1054 D/UFO/CPj1055 grey hol | C8, A5
ACC_N/CAC_N (N>3)
J8 | 628050 | 14 | CPj0542 | d | D/ABC transp | C9, A9
CGT_N/GTC_N (N>3)
J9 | 606910 | 14 | CPj0525 | d | D/UFO | C12, A6, TR4, M7
TCC_N (N>4)
J10 | 1148562 | 15 | ftsH | d | D | C10, A8
TTC_N/TTC_N (N>4)
J11 | 955863 | 15 | yphC | d | C/GTPase | C11, A7
ATGCT_N (N>2)
J12 | 258110 | 15 | ypdP | d | D/UFO | C13, A11
ATTAA_N (N>2)
J13 | 407968 | 15 | flhA | d | C | C14
TTTCT_N (N>2)
J14 | 396425 | 15 | CPj0352 | d | D | C15, A10

[0056] TABLE 10 Chlamydia pneumoniae strain J138 TRs
id | position | length | period | genes | note | equivalent to
J1 | 127027 | 80 | 2 × 40 | htrB_1 | C/acyltransferase | -
J2 | 240709 | 38 | 2 × 13 | INT | D/oppA_4 D/oppB_1 | C3, A11
J3 | 254172 | 28 | 4 × 7 | CPj0213 | C/UFO | - (A10?)
J4 | 255396 | 38 | 2 × 17 | CPj0214 | C/UFO | A9, C4
J5 | 277997 | 60 | - | INT | C/UFO/CPj0240 C/UFO/CPj0241 | C5, A8
J6 | 341060 | 30 | 2 × 15 | lpxD | D/UDP-acyltransferase | C6, A7
J7 | 379052 | 140 | - | INT | C/ltuB/CPj0333 D/CPj0334 | A6, C7
J8 | 432059 | 300 | - | hctB | C/histone like | C8, A5, M1, TR1
J9 | 451497 | 30 | 2 × 15 | CPj0405 | C/UFO | C9, A4
J10 | 491945 | 25 | 3/4 × 6 | pmp_6 | D | C10, A3
J11 | 568506 | 56 | 2 × 18 | CPj0487 | C/UFO | A2, C11
J12 | 661875 | 120 | - | INT | C/murA D/CPj0572 | C12, A1, M9, TR4
J13 | 916525 | 80 | - | CPj0809 | C/UFO | A16, C13, M11, TR7
J14 | 983940 | 43 | 2 × 13 | rodA | D | C14, A15
J15 | 1028100 | 37 | 2 × 13 | CPj0897 | C/phosphohydrolase | C15, A14
J16 | 1084803 | 18 | 3 × 6 | glgA | C/glycogen synthase | C16, A13

[0057] TABLE 11 Chlamydia pneumoniae strain J138 LRs (inverse)
id | first position | second position | length | first location | second location | equivalent to
J1 | 207048 | 208837 | 35 | D/CPj0165/UFO | C/CPj0169/UFO | C1, A1
J2 | 954625 | 954680 | 32 | INT-C/CPj0843/UFO | C/CPj0843/UFO | C3, A2
J3 | 493190 | 505913 | 23 | D/pmp_6 | C/pmp_10 | C2, A3

[0058] TABLE 12 Chlamydia pneumoniae strain J138 LRs (direct)
id | first position | second position | length | first location | second location | equivalent to
J1 | 234904 | 236638 | 27 | D/oppA_1 | D/oppA_2 | C2, A10
J2 | 259184 | 259337 | 26 | INT | D/tgt | C3, A9
J3 | 289975 | 292790 | 40 | C/CPj0255/UFO | C/CPj0259/UFO | C4, A8
J4 | 415181 | 416552 | 31 | D/CPj0369/UFO | D/CPj0370/UFO | C5, A6
J5 | 491436 | 491829 | 47 | D/pmp_6 | D/pmp_6 | A5
J6 | 495556 | 498413 | 23 | D/pmp_7 | D/pmp_8 | C6, A4
J7 | 501626 | 514452 | 24 | D/pmp_9 | D/pmp_9 (2pmp9...) | C7, A3
J8 | 522427 | 524825 | 28 | C/CPj0457/UFO | C/CPj0458/UFO | C8, A2
J9 | 528177 | 530594 | 29 | C/CPj0461/UFO | C/CPj0462/UFO | C9, A1

[0059] TABLE 13 Chlamydia trachomatis strain D/UW-3/Cx SSRs
id | position | length | gene | sense | note | equivalent to
C(G)_N (N>11)
TR1 | 291810 | 12 | INT | d | C/CT259 D/CT260 | -
GT_N (N>5)
TR2 | 964233 | 12 | ftsY | d | C/cell division | -
ATT_N (N>4)
TR3 | 1008839 | 15 | CT857 | d | D/UFO | -
CGT_N (N>3)
TR4 | 456967 | 15 | CT398 | d | D/UFO | M7, J9, A6, C12
GCA_N (N>4)
TR5 | 531772 | 15 | CT456 |  | D/UFO | -
TGCAA_N (N>2)
TR6 | 687502 | 15 | uvrD | d | D | -

[0060] TABLE 14 Chlamydia trachomatis strain D/UW-3/Cx TRs
id | position | length | period | genes | note | equivalent to
TR1 | 51545 | 400 | 15 bp - | hctB | many erased, D/histone like | M1, J8, A5, C8
TR2 | 511072 | 53 | 2 × 15 | tsp | D/protease | -
TR3 | 527891 | 53 | 2 × 17 | argS | D/tRNA transferase | -
TR4 | 531363 | 450 | 3 × 150 | CT456 | D/UFO | J12, A1, C12, M9
TR5 | 532487 | 18 | 3 × 6 | CT456 | D/UFO | -
TR6 | 613891 | 18 | 3 × 6 | dnaE | D/DNA pol III | -
TR7 | 650720 | 140 | - | CT578 | D/UFO | M11, J13, A16, C13
TR8 | 657611 | 58 | 2 × 13 | gp6D | D/UFO/plasmid paralog | -
TR9 | 861061 | 40 | 2 × 16 | CT741 | C/UFO | -
TR10 | 984536 | 45 | 2 × 13 | INT | D/tRNASer_4 D/CT837 | -

[0061] TABLE 15 Chlamydia trachomatis strain D/UW-3/Cx LRs (direct)
id | first position | second position | length | first location | second location
TR1 | 485249 | 574902 | 25 | tRNASer_3 | tRNASer_2
TR2 | 853782 | 875828 | 5474 | rRNA + tRNA | rRNA + tRNA

[0062] TABLE 16 Chlamydia muridarum SSRs
id | position | length | gene | sense | note | equivalent to
C(G)_N (N>11)
M1 | 501838 | 12 | TC0436 | d | C/phospholipase | -
M2 | 496505 | 12 | INT | i | C/TC0432 C/TC0433 phospholipases | -
M3 | 542159 | 13 | TC0447 | i | D/phospholipase | -
AC_N (N>5)
M4 | 787670 | 12 | TC0662 | d | D/UFO | -
ACA_N (N>4)
M5 | 1001176 | 15 | TC0868 | d | D/UFO | -
AGC_N (N>4)
M6 | 212303 | 17 | TC0181 | d | C | -
CGT_N (N>3)
M7 | 807276 | 15 | TC0677 | d | D/UFO | J9, C12, TR4, A6
CCTCC_N (N>2)
M8 | 889812 | 15 | TC0750 | d | D/UFO | -
GAGAG_N (N>2)
M9 | 272616 | 15 | TC0235 | d | C/UFO | -

[0063] TABLE 17 Chlamydia muridarum TRs

id   position  length  period   genes   note                         similar to
M1   369850    450     —        TC0337  C/UFO                        TR1 J8 A5 C8
M2   452977    50      4 × 14   TC0392  D/UFO                        —
M3   602448    48      2 × 21   TC0500  C/UFO                        —
M4   715922    975     5 × 201  TC0602  C/helicase                   —
M5   738193    37      2 × 17   TC0618  C/dehydrogenase              —
M6   756226    55      2 × 13   TC0634  C/UFO                        —
M7   758966    22      2 × 13   TC0635  C/UFO                        —
M8   871834    44      2 × 13   TC0733  C/SecDF                      —
M9   881447    930     3 × 330  TC0741  D/UFO                        J12 A1 TR4 C12
M10  985017    23      2 × 13   INT     D/TC0850/type III secretion  —
                                        D/TC0853/type III membrane
M11  1000266   80      —        TC0867  D/UFO                        TR7 J13 C13 A16
M12  1036721   18      3 × 6    TC0898  D/helicase/uvrD              —

[0064] TABLE 18 Chlamydia muridarum LRs (inverse)

id  first   second  length  first                    second
M1  93051   985895  23      C/TC0080/trigger factor  D/TC0853/type III
M2  495386  533316  1150    C/TC0432/phospholipase   D/TC0440/phospholipase
M3  497071  533316  978     C/TC0433/phospholipase   D/TC0440/phospholipase

[0065] TABLE 19 LRs (direct)

id  first   second  length  first                   second
M1  133478  151897  1050    D/TC0113/UFO-INT        INT-rRNA
M2  134545  156503  800     C/TC0114/UFO-INT        C/TC0130/UFO-INT
M3  236729  238122  25      D/TC0204/permease       D/TC0205/permease
M4  495294  496810  1244    C/TC0432/phospholipase  C/TC0433/phospholipase
M5  503667  513542  23      D/TC0437/adherence      D/TC0438/adherence
M6  539556  540091  119     D/TC0444/UFO            C/TC0445/UFO
M7  834344  923737  1050    C/tRNA-Ser-3            C/tRNA-Ser-4
                            D/TC0696/ABC transport  D/TC0784/helicase

[0066] Some interesting features can be observed from these data. First, repeated sequences are more frequent in C. pneumoniae strains than in non-C. pneumoniae strains. This is true for SSRs, TRs, and direct LRs (Student's t-test, P<0.05 for SSR and TR and P<0.1 for LDR) (Table 20).

TABLE 20

        SSR  TR  LDR  LIR  Multiplets
Cpn CW  15   16  10   3    1
Cpn A   12   16  13   3    0
Cpn J   14   16  9    3    1
Ctr     6    10  2    0    0
Cmu     9    12  7    3
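As a rough check of the comparison above, the pooled-variance (Student) t statistic can be computed directly from the per-genome counts in Table 20. The sketch below is illustrative only. It assumes a two-sample test of the three C. pneumoniae genomes against the two other genomes, which the text does not spell out; the function name `student_t` and the tabulated critical values (two-sided, 3 degrees of freedom) are supplied here and are not part of the original analysis.

```python
from math import sqrt
from statistics import mean, variance

def student_t(a, b):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df

# Counts from Table 20: (Cpn CW, Cpn A, Cpn J) versus (Ctr, Cmu).
counts = {
    "SSR": ([15, 12, 14], [6, 9]),
    "TR": ([16, 16, 16], [10, 12]),
    "LDR": ([10, 13, 9], [2, 7]),
}

# Standard two-sided critical values of Student's t for 3 degrees of freedom.
T_CRIT_005 = 3.182  # P = 0.05
T_CRIT_010 = 2.353  # P = 0.10

results = {name: student_t(a, b) for name, (a, b) in counts.items()}
for name, (t, df) in results.items():
    print(f"{name}: t = {t:.2f}, df = {df}")
```

Under these assumptions the SSR and TR statistics exceed the 0.05 critical value, and the LDR statistic falls between the 0.10 and 0.05 values, in line with the P values quoted above.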

[0067] Second, several of these repeated sequences fall within the pmp locus. Indeed, even if larger numbers of LRs in C. pneumoniae can be attributed to the Pmp proteins, the larger numbers of SSRs and TRs are typically outside of these elements and possibly reflect other variation strategies. Third, this approach allowed us to discover a new family of seven genes encoding proteins that we call POMPs for polymorphic outer membrane proteins (FIG. 1).

[0068] Characterization of the POMPs

[0069] We performed a similarity search, motif analysis, and detection of transmembrane domains.

[0070] BLAST searches on the complete GenBank/EMBL/DDBJ database returned no significant hits at E<10⁻¹⁰, except for C. pneumoniae sequences. The same result was observed when we performed a full search for orthologues in completely sequenced genomes (including C. trachomatis and C. muridarum). Finally, we carried out BLAST searches on the TIGR database of unfinished genomes, and against the fully sequenced, but still non-annotated genome of C. psittaci, also without positive results. Based on our searches, we concluded that POMP elements were specific to C. pneumoniae, perhaps having been horizontally transferred after divergence from the other fully sequenced chlamydiae.

[0071] The analysis of the amino acid content revealed an excess of some residues, including cysteine, a residue that is characteristic of outer membrane proteins of C. pneumoniae (e.g., in Pmp; Melgosa et al., FEMS Lett., 112:199-204). We then determined whether the hydrophobicity profile, presence of putative transmembrane domains, and von Heijne's method for signal sequence recognition agreed in the prediction of a signal peptide. These methods indicated a signal peptide domain that would be cleaved at residue 51. We then used Klein's method for transmembrane region allocation, which predicted a transmembrane domain in residues 68-84. A similar result was obtained by using Top-pred. Using MTOP, we then predicted the membrane topology of the peptide. Results indicated that the N-terminal side should be inside, and the C-terminus outside. Thus, bioinformatic analyses consistently suggested that the POMP peptide was a membrane protein with one transmembrane segment, and a cytoplasmic N-terminus.
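To give a feel for what a hydrophobicity profile reports, the following sketch averages Kyte-Doolittle hydropathy values over a stretch of residues. It is not one of the methods named above (von Heijne, Klein, Top-pred, MTOP); the Kyte-Doolittle scale is swapped in here because it is simple to state. The input is an assumption as well: the first 96 residues of SEQ ID NO: 3 from the sequence listing, transcribed by hand to one-letter code, and the text does not say which POMP that sequence corresponds to. The names `KD`, `N_TERM`, and `window_mean` are hypothetical.

```python
# Kyte-Doolittle hydropathy scale (positive values are hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Residues 1-96 of SEQ ID NO: 3 (assumed transcription; verify against
# the sequence listing before relying on it).
N_TERM = (
    "MQVLLSPQLPPPPQHS"
    "VGSISSPSKLRVLAIT"
    "FLVFGMLLLISGALFL"
    "TLGIPGLSAAISFGLG"
    "IGLSALGGVLMISGLL"
    "CLLVKREIPTVRPEEI"
)

def window_mean(seq, start, end):
    """Mean hydropathy of residues start..end (1-based, inclusive)."""
    segment = seq[start - 1:end]
    return sum(KD[aa] for aa in segment) / len(segment)

tm_score = window_mean(N_TERM, 68, 84)     # predicted transmembrane span
flank_score = window_mean(N_TERM, 85, 96)  # residues just after that span
print(f"residues 68-84: {tm_score:.2f}")
print(f"residues 85-96: {flank_score:.2f}")
```

On this input the 68-84 window scores as strongly hydrophobic while the residues that follow score as hydrophilic, which is the kind of contrast a transmembrane predictor looks for. It does not by itself establish topology or a cleavage site.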

[0072] The putative amino acid sequences of POMP2 and POMP4 polypeptides are depicted in FIGS. 2A-2C and 3A-3C, respectively. The corresponding polynucleotide sequences are found at the region of the annotated sequence indicated in FIG. 1.

[0073] The identification of POMPs as a multigenic family restricted to C. pneumoniae strains implicates the POMP polynucleotides and polypeptides as being useful in the development of therapeutic and diagnostic agents, as described below.

[0074] Antibodies

[0075] The POMP polypeptides and polynucleotides of the invention (or variants thereof) or cells expressing the same can be used as immunogens to produce antibodies immunospecific for such polypeptides or polynucleotides, respectively. Antibodies generated against POMP polypeptides or polynucleotides can be obtained by administering the polypeptides and/or polynucleotides, or epitope-bearing fragments of either or both, analogues of either or both, or cells expressing either or both, to an animal, preferably a nonhuman, using routine protocols. For preparation of monoclonal antibodies, any technique known in the art that provides antibodies produced by continuous cell line cultures can be used. Techniques for the production of single chain antibodies (U.S. Pat. No. 4,946,778) can be adapted to produce single chain antibodies to polypeptides or polynucleotides of this invention. Additionally, transgenic mice, or other organisms such as other mammals, may be used to express humanized antibodies immunospecific to the POMP polypeptides or polynucleotides of the invention. Phage display technology may also be utilized to select antibody genes with binding activities towards a POMP polypeptide of the invention, either from repertoires of PCR amplified v-genes of lymphocytes from humans screened for possessing anti-POMP, or from naive libraries. The affinity of these antibodies can also be improved by, for example, chain shuffling.

[0076] The above-described antibodies may be employed to isolate or to identify clones expressing a POMP polypeptide or polynucleotide of the invention, or to purify the polypeptide or polynucleotide by, for example, affinity chromatography. Antibodies against a POMP polypeptide or POMP polynucleotide may be employed to treat infections of C. pneumoniae.

[0077] In accordance with an aspect of the invention, there is provided the use of a POMP polynucleotide of the invention for therapeutic or prophylactic purposes, in particular genetic immunization. Among the particularly preferred embodiments of the invention are naturally occurring allelic variants of POMP polynucleotides and polypeptides encoded thereby. The use of a polynucleotide of the invention in genetic immunization will preferably employ a suitable delivery method such as direct injection of plasmid DNA into muscles, delivery of DNA complexed with specific protein carriers, coprecipitation of DNA with calcium phosphate, encapsulation of DNA in various forms of liposomes, particle bombardment, or in vivo infection using cloned retroviral vectors.

[0078] Drug Screening

[0079] POMP polypeptides and polynucleotides of the invention may also be used to assess the binding of small molecule substrates and ligands in, for example, cells, cell-free preparations, chemical libraries, and natural product mixtures. These substrates and ligands may be naturally occurring or may be structural or functional mimetics. In general, antagonists of POMP function may be employed for therapeutic and prophylactic purposes for treating infections of C. pneumoniae. The screening methods may simply measure the binding of a candidate compound to a POMP polypeptide or polynucleotide, or to cells or membranes bearing the polypeptide or polynucleotide, or a fusion protein of the polypeptide, by means of a label directly or indirectly associated with the candidate compound. Alternatively, the screening method may involve competition with a labeled competitor. Further, these screening methods may test whether the candidate compound results in a signal generated by activation or inhibition of the POMP polypeptide, using detection systems appropriate to the cells expressing the POMP polypeptide. Inhibitors of activation are generally assayed in the presence of a known agonist, and the effect of the candidate compound on activation by the agonist is observed.

[0080] POMP polypeptides may be used to identify membrane bound or soluble receptors, if any, for such polypeptide, through standard receptor binding techniques known in the art. These techniques include, but are not limited to, ligand binding and crosslinking assays in which the polypeptide is labeled with a radioactive isotope (for instance, ¹²⁵I), chemically modified (for instance, biotinylated), or fused to a peptide sequence suitable for detection or purification, and incubated with a source of the putative receptor (e.g., cells, cell membranes, cell supernatants, tissue extracts, bodily materials). Other methods include biophysical techniques such as surface plasmon resonance and spectroscopy. These screening methods may also be used to identify agonists and antagonists of the polypeptide that compete with the binding of the polypeptide to its receptor(s), if any. Standard methods for conducting such assays are well understood in the art.

[0081] Vaccines

[0082] The invention provides a method for inducing an immunological response in an individual, particularly a mammal, by inoculating the individual with a POMP polynucleotide and/or polypeptide, or a fragment or variant thereof, adequate to produce antibody and/or T cell immune response to protect that individual from an infection of C. pneumoniae.

[0083] A polypeptide of the invention may be used as an antigen for vaccination of a host to produce specific antibodies which protect against invasion of C. pneumoniae, for example by blocking adherence of bacteria to damaged tissue. Examples of tissue damage include wounds in skin or connective tissue caused, for example, by mechanical, chemical, thermal or radiation damage or by implantation of indwelling devices, or wounds in the mucous membranes, such as the mouth, throat, mammary glands, urethra, or vagina.

[0084] The invention also includes a vaccine formulation that includes an immunogenic recombinant polypeptide and/or polynucleotide of the invention together with a suitable carrier, such as a pharmaceutically acceptable carrier. Since the polypeptides and polynucleotides may be broken down in the stomach, each is preferably administered parenterally, including, for example, administration that is subcutaneous, intramuscular, intravenous, or intradermal. Formulations suitable for parenteral administration include aqueous and non-aqueous sterile injection solutions which may contain anti-oxidants, buffers, bacteriostatic compounds and solutes which render the formulation isotonic with the bodily fluid, preferably the blood, of the individual; and aqueous and non-aqueous sterile suspensions which may include suspending agents or thickening agents. The formulations may be presented in unit-dose or multi-dose containers, for example, sealed ampoules and vials, and may be stored in a freeze-dried condition requiring only the addition of the sterile liquid carrier immediately prior to use. The vaccine formulation may also include adjuvant systems for enhancing the immunogenicity of the formulation, such as oil-in-water systems and other systems known in the art. The dosage will depend on the specific activity of the vaccine and can be readily determined by routine experimentation.

[0085] Diagnostics

[0086] Antibodies that specifically bind a POMP polypeptide may be used for the diagnosis of an infection of C. pneumoniae, or in assays to monitor patients being treated for an infection of C. pneumoniae. The antibodies useful for diagnostic purposes may be prepared in the same manner as those described above for therapeutics. Diagnostic assays for POMP polypeptides include methods that utilize the antibody and a label to detect POMP polypeptides in human body fluids or extracts of cells or tissues. The antibodies may be used with or without modification, and may be labeled by joining them, either covalently or non-covalently, with a reporter molecule. A wide variety of reporter molecules known in the art may be used. A variety of detection protocols (e.g., ELISA, RIA, and FACS) are also known in the art and provide a basis for diagnosing an infection of C. pneumoniae on the basis of detection of a POMP polypeptide.

[0087] POMP polynucleotides may also be used for diagnostic purposes. POMP polynucleotide sequences that may be used include antisense RNA and DNA molecules, and oligonucleotide sequences. The POMP polynucleotides may be used to detect and quantitate POMP expression in biopsied tissues. The diagnostic assay may be used to monitor an infection of C. pneumoniae during therapeutic intervention.

[0088] POMP polynucleotides may be used in Southern or northern analysis, dot blot, or other membrane-based technologies; in PCR technologies; or in dipstick, pin, ELISA or chip assays utilizing fluids or tissues from patient biopsies to detect an infection of C. pneumoniae. Such qualitative or quantitative methods are well known in the art.

[0089] POMP polynucleotides may be labeled by standard methods, and added to a fluid or tissue sample from a patient under conditions suitable for the formation of hybridization complexes. After a suitable incubation period, the sample is washed and the signal is quantitated and compared with a standard value. If the amount of signal in the biopsied or extracted sample is significantly altered from that of a comparable control sample, the labeled POMP polynucleotides have hybridized with polynucleotide sequences in the sample, indicating the presence of C. pneumoniae in the sample. Such assays may also be used to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials, or in monitoring the treatment of an individual patient.

[0090] Once disease is established and a treatment protocol is initiated, hybridization assays may be repeated on a regular basis to evaluate whether the expression in the patient is eliminated. The results obtained from successive assays may be used to show the efficacy of treatment over a period ranging from several days to months.

[0091] The foregoing results were obtained using the following methods.

[0092] Methods

[0093] An SSR is a strictly tandem repeat with n elements of a motif X (e.g., three copies of CG in CGCGCG). Considering L, the length of the genome, the probability of not finding X_n anywhere is given by the formula:

P = (1 - f_X^{n})^{L}

[0094] where f_X is the relative frequency of the motif X in the genome. We used a threshold p-value of 0.01 and searched for significant SSR elements with motifs ranging in length from 1 to 5 nucleotides, in all genomes of chlamydiae using standard pattern matching methods.
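The formula and the search step can be sketched as follows. This is a minimal illustration, not the code used for the reported results. It assumes that "significant" means the chance of seeing at least one run of n copies, 1 - P, is below the 0.01 threshold, and it uses a regular expression as one possible "standard pattern matching method". The function names and the example frequency and genome length are hypothetical.

```python
import re

def p_absent(f_x, n, genome_len):
    """P = (1 - f_X^n)^L: probability that n tandem copies of X occur nowhere."""
    return (1.0 - f_x ** n) ** genome_len

def min_significant_n(f_x, genome_len, alpha=0.01):
    """Smallest n whose chance occurrence (1 - P) is below alpha; needs 0 < f_x < 1."""
    n = 1
    while 1.0 - p_absent(f_x, n, genome_len) >= alpha:
        n += 1
    return n

def find_ssrs(seq, motif, min_n):
    """Return (start, copies) for each run of at least min_n tandem copies of motif."""
    pattern = re.compile("(?:%s){%d,}" % (re.escape(motif), min_n))
    return [(m.start(), len(m.group()) // len(motif)) for m in pattern.finditer(seq)]

# Illustrative values only: a single-base frequency of 0.2 and a genome
# of about 1.23 Mb are assumptions, not figures from the text.
threshold = min_significant_n(0.2, 1_230_000)
print("minimum copies:", threshold)

# Toy sequence: one run of three CG copies, then a run of only two.
print(find_ssrs("AACGCGCGTTCGCG", "CG", 3))
```

With the assumed frequency and genome length, the rule gives a minimum of 12 copies for a single-nucleotide motif, which is consistent with the N>11 threshold listed for C(G) runs in the SSR tables.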

[0095] LR minimal length was defined through the use of a statistic of extremes that takes into account the composition in nucleotides and the length of the genome. For chlamydiae, this value is in the range of 25 nucleotides, which coincides with the minimal region of strict homology required for homologous recombination in E. coli and B. subtilis.
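The text does not give the extreme-value statistic itself. As a rough illustration of how composition and genome length can set a minimal length, the sketch below uses a simple approximation that is an assumption of this example, not the authors' statistic: the expected number of exactly matching pairs of length k in a random sequence is taken as about (number of position pairs) × p^k, where p is the probability that two random bases match, and k is raised until that expectation falls below a chosen level. The base frequencies, genome length, and significance level are illustrative assumptions.

```python
from math import ceil, log

def min_repeat_length(base_freqs, genome_len, alpha=0.01):
    """Shortest length k at which the expected number of chance exact
    repeats, approximated as pairs * p_match**k, drops below alpha."""
    p_match = sum(f * f for f in base_freqs.values())
    n_pairs = genome_len * (genome_len - 1) / 2
    return ceil(log(n_pairs / alpha) / log(1 / p_match))

# Assumed composition (about 40% G+C) and genome size (about 1.23 Mb).
freqs = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
k_min = min_repeat_length(freqs, 1_230_000)
print("approximate minimal repeat length:", k_min)
```

Under these assumptions the approximation lands in the low to mid twenties, the same order as the value of about 25 nucleotides stated above. A proper extreme-value treatment would also account for overlapping matches and for inverted copies.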

[0096] A small repeat was kept if it had at least the minimal significant length and if its two or more copies occurred at short distances (<1 kbp). We searched for such repeats in sliding windows of 1000 bp, and for each window we computed the extreme statistics that allowed the definition of the length threshold in the window. These values varied slightly from window to window (as a function of the window composition), but typically ranged from 12 to 14 bp. We then looked for more distinctive tandem repeats by identifying repeats with occurrences less than 50 bp apart and those with copies separated by less than three times their length, and by checking all the others by eye using dot-plots.
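The distance criteria in this paragraph can be sketched as below. The example is a simplification under stated assumptions: it finds exact repeated words of a fixed length k rather than recomputing a length threshold per window, it does not extend matches to their maximal length, and it reads the two tandem criteria (less than 50 bp apart, or less than three times the repeat length apart) as alternatives. The function names are hypothetical.

```python
from collections import defaultdict

def close_repeats(seq, k, max_dist=1000):
    """Return sorted (first, second, word) for consecutive exact copies of a
    k-letter word whose start positions are at most max_dist apart."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    hits = []
    for word, where in positions.items():
        for a, b in zip(where, where[1:]):
            if b - a <= max_dist:
                hits.append((a, b, word))
    return sorted(hits)

def is_tandem_like(first, second, length, max_gap=50):
    """True if the copies start less than max_gap apart, or less than
    three times their length apart."""
    distance = second - first
    return distance < max_gap or distance < 3 * length

# Toy sequence with one 7-letter word repeated back to back.
for first, second, word in close_repeats("GATTACAGATTACA", 7):
    print(first, second, word, is_tandem_like(first, second, len(word)))
```

In practice the candidates returned by such a scan would still need the per-window length threshold and the dot-plot inspection described above.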

1 8 1 4 PRT Artificial Sequence VARIANT 4 Xaa = L, V or I 1 Gly Gly AlaXaa 1 2 4 PRT Artificial Sequence VARIANT 2, 3 Xaa = any amino acid 2Phe Xaa Xaa Asn 1 3 334 PRT Chlamydia pneumoniae 3 Met Gln Val Leu LeuSer Pro Gln Leu Pro Pro Pro Pro Gln His Ser 1 5 10 15 Val Gly Ser IleSer Ser Pro Ser Lys Leu Arg Val Leu Ala Ile Thr 20 25 30 Phe Leu Val PheGly Met Leu Leu Leu Ile Ser Gly Ala Leu Phe Leu 35 40 45 Thr Leu Gly IlePro Gly Leu Ser Ala Ala Ile Ser Phe Gly Leu Gly 50 55 60 Ile Gly Leu SerAla Leu Gly Gly Val Leu Met Ile Ser Gly Leu Leu 65 70 75 80 Cys Leu LeuVal Lys Arg Glu Ile Pro Thr Val Arg Pro Glu Glu Ile 85 90 95 Pro Glu GlyVal Ser Leu Ala Pro Ser Glu Glu Pro Ala Leu Gln Ala 100 105 110 Ala GlnLys Thr Leu Ala Gln Leu Pro Lys Glu Leu Asp Gln Leu Asp 115 120 125 ThrAsp Ile Gln Glu Val Phe Ala Cys Leu Arg Lys Leu Lys Asp Ser 130 135 140Lys Tyr Glu Ser Arg Ser Phe Leu Asn Asp Ala Lys Lys Glu Leu Arg 145 150155 160 Val Phe Asp Phe Val Val Glu Asp Thr Leu Ser Glu Ile Phe Glu Leu165 170 175 Arg Gln Ile Val Ala Gln Glu Gly Trp Asp Leu Asn Phe Leu IleAsn 180 185 190 Gly Gly Arg Ser Leu Met Met Thr Ala Glu Ser Glu Ser LeuAsp Leu 195 200 205 Phe His Val Ser Lys Arg Leu Gly Tyr Leu Pro Ser GlyAsp Val Arg 210 215 220 Gly Glu Gly Leu Lys Lys Ser Ala Lys Glu Ile ValAla Arg Leu Met 225 230 235 240 Ser Leu His Cys Glu Ile His Lys Val AlaVal Ala Phe Asp Arg Asn 245 250 255 Ser Tyr Ala Met Ala Glu Lys Ala PheAla Lys Ala Leu Gly Ala Leu 260 265 270 Glu Glu Ser Val Tyr Arg Ser LeuThr Gln Ser Tyr Arg Asp Lys Phe 275 280 285 Leu Glu Ser Glu Arg Ala LysIle Pro Trp Asn Gly His Ile Thr Trp 290 295 300 Leu Arg Asp Asp Ala LysSer Gly Cys Ala Glu Lys Lys Leu Gly Met 305 310 315 320 Pro Arg Asn ValGly Arg Asn Leu Gly Lys Gln Ser Phe Gly 325 330 4 811 PRT Chlamydiapneumoniae 4 Met Gln Val Leu Leu Ser Pro Gln Leu Pro Pro Pro Pro Gln HisSer 1 5 10 15 Val Gly Ser Ile Ser Ser Pro Ser Lys Leu Arg Val Leu AlaIle Thr 20 25 30 Phe Leu Val Phe Gly Met Leu Leu Leu Ile Ser Gly Ala LeuPhe Leu 35 40 45 
Thr Leu Gly Ile Pro Gly Leu Ser Ala Ala Ile Ser Phe GlyLeu Gly 50 55 60 Ile Gly Leu Ser Ala Leu Gly Gly Val Leu Met Ile Ser GlyLeu Leu 65 70 75 80 Cys Leu Leu Val Lys Arg Glu Ile Pro Thr Val Arg ProGlu Glu Ile 85 90 95 Pro Glu Gly Val Ser Leu Ala Pro Ser Glu Glu Pro AlaLeu Gln Ala 100 105 110 Ala Gln Lys Thr Leu Ala Gln Leu Pro Lys Glu LeuAsp Gln Leu Asp 115 120 125 Thr Asp Ile Gln Glu Val Phe Ala Cys Leu ArgLys Leu Lys Asp Ser 130 135 140 Lys Tyr Glu Ser Arg Ser Phe Leu Asn AspAla Lys Lys Glu Leu Arg 145 150 155 160 Val Phe Asp Phe Val Val Glu AspThr Leu Ser Glu Ile Phe Glu Leu 165 170 175 Arg Gln Ile Val Ala Gln GluGly Trp Asp Leu Asn Phe Leu Ile Asn 180 185 190 Gly Gly Arg Ser Leu MetMet Thr Ala Glu Ser Glu Ser Leu Asp Leu 195 200 205 Phe His Val Ser LysArg Leu Gly Tyr Leu Pro Ser Gly Asp Val Arg 210 215 220 Gly Glu Gly LeuLys Lys Ser Ala Lys Glu Ile Val Ala Arg Leu Met 225 230 235 240 Ser LeuHis Cys Glu Ile His Lys Val Ala Val Ala Phe Asp Arg Asn 245 250 255 SerTyr Ala Met Ala Glu Lys Ala Phe Ala Lys Ala Leu Gly Ala Leu 260 265 270Glu Glu Ser Val Tyr Arg Ser Leu Thr Gln Ser Tyr Arg Asp Lys Phe 275 280285 Leu Glu Ser Glu Arg Ala Lys Ile Pro Trp Asn Gly His Ile Thr Trp 290295 300 Leu Arg Asp Asp Ala Lys Ser Gly Cys Ala Glu Lys Lys Leu Arg Asp305 310 315 320 Ala Glu Glu Arg Trp Lys Lys Phe Arg Lys Ala Val Phe TrpVal Glu 325 330 335 Glu Asp Gly Gly Phe Asp Ile Asn Asn Leu Leu Gly AspTrp Gly Thr 340 345 350 Val Leu Asp Pro Tyr Arg Gln Glu Arg Met Asp GluIle Thr Phe His 355 360 365 Glu Leu Tyr Glu Lys Thr Thr Phe Leu Lys ArgLeu His Arg Lys Cys 370 375 380 Ala Leu Ala Lys Thr Thr Phe Glu Lys LysArg Ser Lys Lys Asn Leu 385 390 395 400 Gln Ala Val Glu Glu Ala Asn AlaArg Arg Leu Lys Tyr Val Arg Asp 405 410 415 Trp Tyr Asp Gln Glu Phe GlnLys Ala Gly Glu Arg Leu Glu Lys Leu 420 425 430 His Ala Leu Tyr Pro GluVal Ser Val Ser Ile Arg Glu Asn Lys Ile 435 440 445 Gln Glu Thr Arg SerAsn Leu Glu Lys Ala Tyr Glu Ala Ile Glu Glu 450 455 460 Asn Tyr Arg CysCys Val Arg Glu Gln Glu 
Asp Tyr Trp Lys Glu Glu 465 470 475 480 Glu LysArg Glu Ala Glu Phe Arg Glu Arg Gly Asn Lys Ile Leu Ser 485 490 495 ProGlu Glu Leu Glu Ser Ser Leu Glu Gln Phe Asp His Gly Leu Lys 500 505 510Asn Phe Ser Glu Lys Leu Met Glu Leu Glu Gly His Ile Leu Lys Leu 515 520525 Gln Lys Glu Ala Thr Ala Glu Val Glu Asn Lys Ile Leu Ser Asp Ala 530535 540 Glu Ser Arg Leu Glu Ile Val Phe Glu Asp Val Lys Glu Met Pro Cys545 550 555 560 Arg Ile Glu Glu Ile Glu Lys Thr Leu Arg Met Ala Glu LeuPro Leu 565 570 575 Leu Pro Thr Lys Lys Ala Phe Glu Lys Ala Cys Ser GlnTyr Asn Ser 580 585 590 Cys Ala Glu Met Leu Glu Lys Val Lys Pro Tyr CysLys Glu Ser Leu 595 600 605 Ala Tyr Val Thr Ser Lys Glu Arg Leu Val SerLeu Asp Glu Asp Leu 610 615 620 Arg Arg Ala Tyr Thr Glu Cys Gln Lys ArgPhe Gln Gly Asp Ser Gly 625 630 635 640 Leu Glu Ser Glu Val Arg Ala CysArg Glu Gln Leu Arg Glu Arg Ile 645 650 655 Gln Glu Phe Glu Thr Gln GlyLeu Asp Leu Val Glu Lys Glu Leu Leu 660 665 670 Cys Val Ser Ser Arg LeuArg Asn Thr Glu Cys Asp Cys Val Ser Gly 675 680 685 Val Lys Lys Glu AlaPro Pro Gly Lys Lys Phe Tyr Ala Gln Tyr Tyr 690 695 700 Asp Glu Ile TyrArg Val Arg Val Gln Ser Arg Trp Met Thr Met Ser 705 710 715 720 Glu ArgLeu Arg Glu Gly Val Gln Ala Cys Asn Lys Met Leu Lys Ala 725 730 735 GlyLeu Ser Glu Glu Asp Lys Val Leu Lys Glu Glu Glu Tyr Trp Leu 740 745 750Tyr Arg Glu Glu Arg Lys Asn Lys Glu Lys Arg Leu Val Gly Thr Lys 755 760765 Ile Val Ala Thr Gln Gln Arg Val Ala Ala Phe Glu Ser Ile Glu Val 770775 780 Pro Glu Ile Pro Glu Ala Pro Glu Glu Lys Pro Ser Leu Leu Asp Lys785 790 795 800 Ala Arg Ser Leu Phe Thr Arg Glu Asp His Ser 805 810 5810 PRT Chlamydia pneumoniae 5 Met Gln Val Leu Leu Ser Pro Gln Leu ProPro Pro Gln His Ser Val 1 5 10 15 Gly Ser Ile Ser Ser Pro Ser Lys LeuArg Val Leu Ala Ile Thr Phe 20 25 30 Leu Val Phe Gly Met Leu Leu Leu IleSer Gly Ala Leu Phe Leu Thr 35 40 45 Leu Gly Ile Pro Gly Leu Ser Ala AlaIle Ser Phe Gly Leu Gly Ile 50 55 60 Gly Leu Ser Ala Leu Gly Gly Val LeuMet Ile Ser Gly Leu Leu Cys 65 70 75 
80 Leu Leu Val Lys Arg Glu Ile ProThr Val Arg Pro Glu Glu Ile Pro 85 90 95 Glu Gly Val Ser Leu Ala Pro SerGlu Glu Pro Ala Leu Gln Ala Ala 100 105 110 Gln Lys Thr Leu Ala Gln LeuPro Lys Glu Leu Asp Gln Leu Asp Thr 115 120 125 Asp Ile Gln Glu Val PheAla Cys Leu Arg Lys Leu Lys Asp Ser Lys 130 135 140 Tyr Glu Ser Arg SerPhe Leu Asn Asp Ala Lys Lys Glu Leu Arg Val 145 150 155 160 Phe Asp PheVal Val Glu Asp Thr Leu Ser Glu Ile Phe Glu Leu Arg 165 170 175 Gln IleVal Ala Gln Glu Gly Trp Asp Leu Asn Phe Leu Ile Asn Gly 180 185 190 GlyArg Ser Leu Met Met Thr Ala Glu Ser Glu Ser Leu Asp Leu Phe 195 200 205His Val Ser Lys Arg Leu Gly Tyr Leu Pro Ser Gly Asp Val Arg Gly 210 215220 Glu Gly Leu Lys Lys Ser Ala Lys Glu Ile Val Ala Arg Leu Met Ser 225230 235 240 Leu His Cys Glu Ile His Lys Val Ala Val Ala Phe Asp Arg AsnSer 245 250 255 Tyr Ala Met Ala Glu Lys Ala Phe Ala Lys Ala Leu Gly AlaLeu Glu 260 265 270 Glu Ser Val Tyr Arg Ser Leu Thr Gln Ser Tyr Arg AspLys Phe Leu 275 280 285 Glu Ser Glu Arg Ala Lys Ile Pro Trp Asn Gly HisIle Thr Trp Leu 290 295 300 Arg Asp Asp Ala Lys Ser Gly Cys Ala Glu LysLys Leu Arg Asp Ala 305 310 315 320 Glu Glu Arg Trp Lys Lys Phe Arg LysAla Val Phe Trp Val Glu Glu 325 330 335 Asp Gly Gly Phe Asp Ile Asn AsnLeu Leu Gly Asp Trp Gly Thr Val 340 345 350 Leu Asp Pro Tyr Arg Gln GluArg Met Asp Glu Ile Thr Phe His Glu 355 360 365 Leu Tyr Glu Lys Thr ThrPhe Leu Lys Arg Leu His Arg Lys Cys Ala 370 375 380 Leu Ala Lys Thr ThrPhe Glu Lys Lys Arg Ser Lys Lys Asn Leu Gln 385 390 395 400 Ala Val GluGlu Ala Asn Ala Arg Arg Leu Lys Tyr Val Arg Asp Trp 405 410 415 Tyr GlyGln Glu Phe Gln Lys Ala Gly Glu Arg Leu Glu Lys Leu His 420 425 430 AlaLeu Tyr Pro Glu Val Ser Val Ser Ile Arg Glu Asn Lys Ile Gln 435 440 445Glu Thr Arg Ser Asn Leu Glu Lys Ala Tyr Glu Ala Ile Glu Glu Asn 450 455460 Tyr Arg Cys Cys Val Arg Glu Gln Glu Asp Tyr Trp Lys Glu Glu Glu 465470 475 480 Lys Arg Glu Ala Glu Phe Arg Glu Arg Gly Asn Lys Ile Leu SerPro 485 490 495 Glu Glu Leu Glu Ser Ser Leu Glu 
Gln Phe Asp His Gly LeuLys Asn 500 505 510 Phe Ser Glu Lys Leu Met Glu Leu Glu Gly His Ile LeuLys Leu Gln 515 520 525 Lys Glu Ala Thr Ala Glu Val Glu Asn Lys Ile LeuSer Asp Ala Glu 530 535 540 Ser Arg Leu Glu Ile Val Phe Glu Asp Val LysGlu Met Pro Cys Arg 545 550 555 560 Ile Glu Glu Ile Glu Lys Thr Leu ArgMet Ala Glu Leu Pro Leu Leu 565 570 575 Pro Thr Lys Lys Ala Phe Glu LysAla Cys Ser Gln Tyr Asn Ser Cys 580 585 590 Ala Glu Met Leu Glu Lys ValLys Pro Tyr Cys Lys Glu Ser Leu Ala 595 600 605 Tyr Val Thr Ser Lys GluArg Leu Val Ser Leu Asp Glu Asp Leu Arg 610 615 620 Arg Ala Tyr Thr GluCys Gln Lys Arg Phe Gln Gly Asp Ser Gly Leu 625 630 635 640 Glu Ser GluVal Arg Ala Cys Arg Glu Gln Leu Arg Glu Arg Ile Gln 645 650 655 Glu PheGlu Thr Gln Gly Leu Asp Leu Val Glu Lys Glu Leu Leu Cys 660 665 670 ValSer Ser Arg Leu Arg Asn Thr Glu Cys Asp Cys Val Ser Gly Val 675 680 685Lys Lys Glu Ala Pro Pro Gly Lys Lys Phe Tyr Ala Gln Tyr Tyr Asp 690 695700 Glu Ile Tyr Arg Val Arg Val Gln Ser Arg Trp Met Thr Met Ser Glu 705710 715 720 Arg Leu Arg Glu Gly Val Gln Ala Cys Asn Lys Met Leu Lys AlaGly 725 730 735 Leu Ser Glu Glu Asp Lys Val Leu Lys Glu Glu Glu Tyr TrpLeu Tyr 740 745 750 Arg Glu Glu Arg Lys Asn Lys Glu Lys Arg Leu Val GlyThr Lys Ile 755 760 765 Val Ala Thr Gln Gln Arg Val Ala Ala Phe Glu SerIle Glu Val Pro 770 775 780 Glu Ile Pro Glu Ala Pro Glu Glu Lys Pro SerLeu Leu Asp Lys Ala 785 790 795 800 Arg Ser Leu Phe Thr Arg Glu Asp HisSer 805 810 6 610 PRT Chlamydia pneumoniae 6 Met Gln Val His Val Ser ProThr Thr Ala Thr Pro Asp His Ser Val 1 5 10 15 Gly Ala Thr Ser Trp GlnPro Lys Leu Arg Ile Leu Thr Ile Thr Phe 20 25 30 Leu Val Leu Gly Val LeuLeu Leu Ile Ser Gly Ala Leu Phe Leu Thr 35 40 45 Leu Gly Val Pro Gly LeuAla Ala Gly Leu Ser Phe Gly Leu Gly Ile 50 55 60 Gly Leu Ser Ala Leu GlyGly Val Leu Val Val Ser Gly Leu Leu Phe 65 70 75 80 Phe Leu Ile Arg ArgGly Val Ser Lys Val Arg Pro Glu Glu Ile Pro 85 90 95 Val Thr Pro Ser HisGlu Ala Gln Lys Ile Leu Cys Gln Leu Pro Gln 100 105 110 
Glu Leu Asp GlnLeu Asp Thr Ser Ile Gln Glu Val Val Ser Cys Leu 115 120 125 Gly Lys LeuLys Asp Leu Lys Tyr Glu Asp Gln Gly Leu Leu Thr Glu 130 135 140 Val GlnGlu Lys Leu Arg Val Phe Asp Phe Val Arg Lys Asp Met Val 145 150 155 160Thr Glu Phe Leu Glu Leu Gln Gln Val Val Ala Gln Glu Gly Gln Phe 165 170175 Leu Asp Tyr Leu Ile Asn Gln Val Gln Ser Ile Ser His Lys Leu Phe 180185 190 Val Pro Asp Val Asn Ile Gly Ala His Leu Ala Glu Leu Cys Gly Tyr195 200 205 Leu Pro Ser Gly Asp Val Arg Val Glu Arg Leu Lys Arg Ser AlaArg 210 215 220 Gln Val Val Asp Arg Phe Met Arg Val Thr Cys Asp Thr ArgLys Val 225 230 235 240 Ala Met Ala Phe Asp Glu Asn Ala Cys Gly Val AlaLys Asn Ala Phe 245 250 255 Asp Lys Ala Phe Gly Ala Leu Glu Glu Cys ValTyr Lys Ser Leu Thr 260 265 270 Glu Ser Tyr Arg Glu Ala Phe Tyr Glu TyrGlu Lys Ala Lys Ile Leu 275 280 285 Arg Asn Glu Asp Val Glu Trp Leu GlnAsp Lys Asn Lys Ser Ala Arg 290 295 300 Ala Glu Gln Arg Phe Arg Glu ValLys Asp Arg Trp Glu Asp Leu Lys 305 310 315 320 Glu Thr Val Phe Trp ValLys Glu Asn Gly Cys Ile Asp Leu Glu Val 325 330 335 Leu Thr Ala Val GlyGly Trp Pro Asp Arg Gly Pro Glu His Leu Ile 340 345 350 Pro Glu Lys ArgArg Asn Lys Val Met Ser His Lys Leu Trp Glu Ala 355 360 365 Thr Met ArgMet Lys Gly Ala Glu Gly Thr Tyr Ser Val Ala Arg Val 370 375 380 Ala PheGlu Lys Asp Gly Ser Arg Lys Asn Gln Lys Lys Phe Gln Glu 385 390 395 400Lys Thr Lys Glu Trp Leu Arg Cys Leu Lys Asp Leu His Asp Gln Glu 405 410415 Cys His Arg Ala Arg Glu Arg Leu Ala Glu Leu Glu Ala Leu Tyr Pro 420425 430 Glu Val Ser Val Ser Val Val Glu Thr Glu Arg Glu Thr Lys Phe Lys435 440 445 Leu Glu Thr Ala Tyr Gly Asn Leu Glu Glu Arg Tyr Gln Ser ValVal 450 455 460 Arg Asp Gln Glu Asp Tyr Trp Lys Glu Glu Glu Asn Lys GluAla Glu 465 470 475 480 Phe Arg Glu Lys Gly Thr Lys Val Arg Ser Pro GluGlu Val Val Glu 485 490 495 Tyr Leu Gln Ile Leu Glu Asn Leu Ser Glu AspCys Ser Lys Gln Leu 500 505 510 Thr Ile Ala Glu Val Val Val Leu Gly ValGlu Leu Glu Ala Thr Ala 515 520 525 Glu Phe Glu Tyr Thr Ile Leu Ser 
AspAla Ala Asn Arg Leu Lys Val 530 535 540 Leu Cys Glu Asp Ile Glu Asp IleLeu Pro Arg Val Glu Glu Ile Glu 545 550 555 560 Ile Met Leu Arg Ile AlaGlu Leu Pro Phe Leu Pro Ile Lys Gln Ala 565 570 575 Phe Thr Lys Ala PheLeu Gln Tyr Asn Ser Cys Lys Asp Lys Leu Ala 580 585 590 Lys Val Glu ProTyr Cys Gln Glu Ser Val Asp Tyr Lys Ser Gly Phe 595 600 605 Arg Val 6107 770 PRT Chlamydia pneumoniae 7 Met Gln Val His Val Ser Pro Gln Leu ProPro Asp His Ser Val Gly 1 5 10 15 Ala Thr Ser Trp Gln Pro Lys Leu ArgIle Leu Thr Ile Thr Phe Leu 20 25 30 Val Leu Gly Val Leu Leu Leu Ile SerGly Ala Leu Phe Leu Thr Leu 35 40 45 Gly Val Pro Gly Leu Ala Ala Gly LeuSer Phe Gly Leu Gly Ile Gly 50 55 60 Leu Ser Ala Leu Gly Gly Val Leu ValVal Ser Gly Leu Leu Phe Phe 65 70 75 80 Leu Ile Arg Arg Gly Val Ser LysVal Arg Pro Glu Glu Ile Pro Val 85 90 95 Thr Pro Ser His Glu Ala Gln LysIle Leu Cys Gln Leu Pro Gln Glu 100 105 110 Leu Asp Gln Leu Asp Thr SerIle Gln Glu Val Val Ser Cys Leu Gly 115 120 125 Lys Leu Lys Asp Leu LysTyr Glu Asp Gln Gly Leu Leu Thr Glu Val 130 135 140 Gln Glu Lys Leu ArgVal Phe Asp Phe Val Arg Lys Asp Met Val Thr 145 150 155 160 Glu Phe LeuGlu Leu Gln Gln Val Val Ala Gln Glu Gly Gln Phe Leu 165 170 175 Asp TyrLeu Ile Asn Gln Val Gln Ser Ile Ser His Lys Leu Phe Val 180 185 190 ProAsp Val Asn Ile Gly Ala His Leu Ala Glu Leu Cys Gly Tyr Leu 195 200 205Pro Ser Gly Asp Val Arg Val Glu Arg Leu Lys Arg Ser Ala Arg Gln 210 215220 Val Val Asp Arg Phe Met Arg Val Thr Cys Asp Thr Arg Lys Val Ala 225230 235 240 Met Ala Phe Asp Glu Asn Ala Cys Gly Val Ala Lys Asn Ala PheAsp 245 250 255 Lys Ala Phe Gly Ala Leu Glu Glu Cys Val Tyr Lys Ser LeuThr Glu 260 265 270 Ser Tyr Arg Glu Ala Phe Tyr Glu Tyr Glu Lys Ala LysIle Leu Arg 275 280 285 Asn Glu Asp Val Glu Trp Leu Gln Asp Lys Asn LysSer Ala Arg Ala 290 295 300 Glu Gln Arg Phe Arg Glu Val Lys Asp Arg TrpGlu Asp Leu Lys Glu 305 310 315 320 Thr Val Phe Trp Val Lys Glu Asn GlyCys Ile Asp Leu Glu Val Leu 325 330 335 Thr Ala Val Gly Gly Trp Pro AspArg 
Gly Pro Glu His Leu Ile Pro 340 345 350 Glu Lys Arg Arg Asn Lys ValMet Ser His Lys Leu Trp Glu Ala Thr 355 360 365 Met Arg Met Lys Gly AlaGlu Gly Thr Tyr Ser Val Ala Arg Val Ala 370 375 380 Phe Glu Lys Asp GlySer Arg Lys Asn Gln Lys Lys Phe Gln Glu Lys 385 390 395 400 Thr Lys GluTrp Leu Arg Cys Leu Lys Asp Leu His Asp Gln Glu Cys 405 410 415 His ArgAla Arg Glu Arg Leu Ala Glu Leu Glu Ala Leu Tyr Pro Glu 420 425 430 ValSer Val Ser Val Val Glu Thr Glu Arg Glu Thr Lys Phe Lys Leu 435 440 445Glu Thr Ala Tyr Gly Asn Leu Glu Glu Arg Tyr Gln Ser Val Val Arg 450 455460 Asp Gln Glu Asp Tyr Trp Lys Glu Glu Glu Asn Lys Glu Ala Glu Phe 465470 475 480 Arg Glu Lys Gly Thr Lys Val Arg Ser Pro Glu Glu Val Val GluTyr 485 490 495 Leu Gln Ile Leu Glu Asn Leu Leu Glu Asp Cys Ser Lys GlnLeu Thr 500 505 510 Ile Ala Glu Val Val Val Leu Gly Val Glu Leu Glu AlaThr Ala Glu 515 520 525 Phe Glu Tyr Thr Ile Leu Ser Asp Ala Ala Asn ArgLeu Lys Val Leu 530 535 540 Cys Glu Asp Ile Glu Asp Ile Leu Pro Arg ValGlu Glu Ile Glu Ile 545 550 555 560 Met Leu Arg Ile Ala Glu Leu Pro PheLeu Pro Ile Lys Gln Ala Phe 565 570 575 Thr Lys Ala Phe Leu Gln Tyr AsnSer Cys Lys Asp Lys Leu Ala Lys 580 585 590 Val Glu Pro Tyr Cys Gln GluSer Val Asp Tyr Arg Arg Asn Lys Glu 595 600 605 Arg Phe Gln Ser Leu AsnGln Asp Leu Gln Asn Val Tyr Gln Glu Cys 610 615 620 Gln Lys Ala Thr GlyLeu Glu Ser Glu Val Ser Ala Tyr Arg Asp His 625 630 635 640 Leu Arg GluGln Ile Thr Glu Phe Glu Thr Gln Gly Leu Asp Val Ile 645 650 655 Lys GluGlu Leu Leu Phe Val Ser Ser Thr Leu Lys Ser Lys Leu Ser 660 665 670 TyrAsp Pro Leu Ile Ala Asp Ile Pro Cys Met Lys Phe Tyr Glu Glu 675 680 685Tyr Tyr Asp Gly Ile Asp Lys Ala Arg Val Gln Ser Arg Trp Leu Glu 690 695700 Lys Ser Glu Arg Tyr Arg Lys Ala Lys Lys Gly Phe Gln Glu Met Leu 705710 715 720 Lys Glu Gly Leu Phe Lys Glu Asp Gln Ala Leu Lys Lys Ala GluTyr 725 730 735 Arg Leu Leu Arg Glu Lys Arg Met Asn Lys Glu Lys Leu LeuIle Cys 740 745 750 Asn Lys Ile Glu Ala Ala Gln Gln Arg Val Gln Glu PheGly Pro Ser 755 
760 765 Asp Ser 770 8 771 PRT Chlamydia pneumoniaeVARIANT 537 Xaa = Any Amino Acid 8 Met Gln Val His Val Ser Pro Thr ThrAla Thr Pro Asp His Ser Val 1 5 10 15 Gly Ala Thr Ser Trp Gln Pro LysLeu Arg Ile Leu Thr Ile Thr Phe 20 25 30 Leu Val Leu Gly Val Leu Leu LeuIle Ser Gly Ala Leu Phe Leu Thr 35 40 45 Leu Gly Val Pro Gly Leu Ala AlaGly Leu Ser Phe Gly Leu Gly Ile 50 55 60 Gly Leu Ser Ala Leu Gly Gly ValLeu Val Val Ser Gly Leu Leu Phe 65 70 75 80 Phe Leu Ile Arg Arg Gly ValSer Lys Val Arg Pro Glu Glu Ile Pro 85 90 95 Val Thr Pro Ser His Glu AlaGln Lys Ile Leu Cys Gln Leu Pro Gln 100 105 110 Glu Leu Asp Gln Leu AspThr Ser Ile Gln Glu Val Val Ser Cys Leu 115 120 125 Gly Lys Leu Lys AspLeu Lys Tyr Glu Asp Gln Gly Leu Leu Thr Glu 130 135 140 Val Gln Glu LysLeu Arg Val Phe Asp Phe Val Arg Lys Asp Met Val 145 150 155 160 Thr GluPhe Leu Glu Leu Gln Gln Val Val Ala Gln Glu Gly Gln Phe 165 170 175 LeuAsp Tyr Leu Ile Asn Gln Val Gln Ser Ile Ser His Lys Leu Phe 180 185 190Val Pro Asp Val Asn Ile Gly Ala His Leu Ala Glu Leu Cys Gly Tyr 195 200205 Leu Pro Ser Gly Asp Val Arg Val Glu Arg Leu Lys Arg Ser Ala Arg 210215 220 Gln Val Val Asp Arg Phe Met Arg Val Thr Cys Asp Thr Arg Lys Val225 230 235 240 Ala Met Ala Phe Asp Glu Asn Ala Cys Gly Val Ala Lys AsnAla Phe 245 250 255 Asp Lys Ala Phe Gly Ala Leu Glu Glu Cys Val Tyr LysSer Leu Thr 260 265 270 Glu Ser Tyr Arg Glu Ala Phe Tyr Glu Tyr Glu LysAla Lys Ile Leu 275 280 285 Arg Asn Glu Asp Val Glu Trp Leu Gln Asp LysAsn Lys Ser Ala Arg 290 295 300 Ala Glu Gln Arg Phe Arg Glu Val Lys AspArg Trp Glu Asp Leu Lys 305 310 315 320 Glu Thr Val Phe Trp Val Lys GluAsn Gly Cys Ile Asp Leu Glu Val 325 330 335 Leu Thr Ala Val Gly Gly TrpPro Asp Arg Gly Pro Glu His Leu Ile 340 345 350 Pro Glu Lys Arg Arg AsnLys Val Met Ser His Lys Leu Trp Glu Ala 355 360 365 Thr Met Arg Met LysGly Ala Glu Gly Thr Tyr Ser Val Ala Arg Val 370 375 380 Ala Phe Glu LysAsp Gly Ser Arg Lys Asn Gln Lys Lys Phe Gln Glu 385 390 395 400 Lys ThrLys Glu Trp Leu Arg Cys Leu Lys 
Asp Leu His Asp Gln Glu 405 410 415 CysHis Arg Ala Arg Glu Arg Leu Ala Glu Leu Glu Ala Leu Tyr Pro 420 425 430Glu Val Ser Val Ser Val Val Glu Thr Glu Arg Glu Thr Lys Phe Lys 435 440445 Leu Glu Thr Ala Tyr Gly Asn Leu Glu Glu Arg Tyr Gln Ser Val Val 450455 460 Arg Asp Gln Glu Asp Tyr Trp Lys Glu Glu Glu Asn Lys Glu Ala Glu465 470 475 480 Phe Arg Glu Lys Gly Thr Lys Val Arg Ser Pro Glu Glu ValVal Glu 485 490 495 Tyr Leu Gln Ile Leu Glu Asn Leu Leu Glu Asp Cys SerLys Gln Leu 500 505 510 Thr Ile Ala Glu Val Val Val Leu Gly Val Glu LeuGlu Ala Thr Ala 515 520 525 Glu Phe Glu Tyr Thr Ile Leu Ser Xaa Ala AlaAsn Arg Leu Lys Val 530 535 540 Leu Cys Glu Asp Ile Glu Asp Ile Leu ProArg Val Glu Glu Ile Glu 545 550 555 560 Ile Met Leu Arg Ile Ala Glu LeuPro Phe Leu Pro Ile Lys Gln Ala 565 570 575 Phe Thr Lys Ala Phe Leu GlnTyr Asn Ser Cys Lys Asp Lys Leu Ala 580 585 590 Lys Val Glu Pro Tyr CysGln Glu Ser Val Asp Tyr Arg Arg Asn Lys 595 600 605 Glu Arg Phe Gln SerLeu Asn Gln Asp Leu Gln Asn Val Tyr Gln Glu 610 615 620 Cys Gln Lys AlaThr Gly Leu Glu Ser Glu Val Ser Ala Tyr Arg Asp 625 630 635 640 His LeuArg Glu Gln Ile Thr Glu Phe Glu Thr Gln Gly Leu Asp Val 645 650 655 IleLys Glu Glu Leu Leu Phe Val Ser Ser Thr Leu Lys Ser Lys Leu 660 665 670Ser Tyr Asp Pro Leu Ile Ala Asp Ile Pro Cys Met Lys Phe Tyr Glu 675 680685 Glu Tyr Tyr Asp Gly Ile Asp Lys Ala Arg Val Gln Ser Arg Trp Leu 690695 700 Glu Lys Ser Glu Arg Tyr Arg Lys Ala Lys Lys Gly Phe Gln Glu Met705 710 715 720 Leu Lys Glu Gly Leu Phe Lys Glu Asp Gln Ala Leu Lys LysAla Glu 725 730 735 Tyr Arg Leu Leu Arg Glu Lys Arg Met Asn Lys Glu LysLeu Leu Ile 740 745 750 Cys Asn Lys Ile Glu Ala Ala Gln Gln Arg Val GlnGlu Phe Gly Pro 755 760 765 Ser Asp Ser 770

What is claimed is:
 1. A method for determining the presence of a strain of chlamydia in a biological sample, said method comprising the steps of: (a) providing a biological sample; and (b) determining the presence of a polynucleotide containing a polymorphic repetitive sequence in a polynucleotide in said sample, said polymorphic repetitive sequence associated with a first strain of chlamydia and not associated with a second strain of chlamydia, wherein the presence of the polynucleotide containing said polymorphic repetitive sequence indicates the presence of said first strain of chlamydia.
 2. The method of claim 1, wherein said chlamydia is C. pneumoniae, C. trachomatis, C. psittaci, or C. muridarum.
 3. The method of claim 1, wherein said first strain is C. pneumoniae strain CWL-029, C. pneumoniae strain AR 39, C. pneumoniae strain J138, or C. trachomatis strain D/UW-3/Cx.
 4. The method of claim 1, wherein said polymorphic repetitive sequence is a simple sequence repeat, a tandem repeat, or a large repeat.
 5. The method of claim 1, wherein said sample is a biopsy sample, blood, serum, peripheral blood mononuclear cells, cerebrospinal fluid, urine, nasal secretion, or saliva.
 6. The method of claim 1, wherein said determining of the presence of a polymorphic repetitive sequence comprises a polynucleotide detection step.
 7. The method of claim 6, wherein said polynucleotide detection step comprises amplification of polynucleotide molecules that contain a polymorphic repetitive sequence.
 8. A method for determining the presence of a plurality of strains of chlamydiae in a biological sample, said method comprising the steps of: (a) providing a biological sample; and (b) determining the presence in said biological sample of a plurality of polynucleotides, each containing a polymorphic repetitive sequence, wherein each polymorphic repetitive sequence is associated with one strain of chlamydia and not associated with another strain of chlamydiae, and wherein the presence of a polymorphic repetitive sequence indicates the presence of said strain of chlamydia associated with said polymorphic repetitive sequence, and absence of a polymorphic repetitive sequence indicates absence of said strain of chlamydia associated with said polymorphic repetitive sequence.
 9. The method of claim 8, wherein said chlamydia is C. pneumoniae, C. trachomatis, C. psittaci, or C. muridarum.
 10. The method of claim 8, wherein said strain is C. pneumoniae strain CWL-029, C. pneumoniae strain AR 39, C. pneumoniae strain J138, or C. trachomatis strain D/UW-3/Cx.
 11. The method of claim 8, wherein said polymorphic repetitive sequence is a simple sequence repeat, a tandem repeat, or a large repeat.
 12. The method of claim 8, wherein said sample is a biopsy sample, blood, serum, peripheral blood mononuclear cells, cerebrospinal fluid, urine, nasal secretion, or saliva.
 13. The method of claim 8, wherein said determining of the presence of a polymorphic repetitive sequence comprises a polynucleotide detection step.
 14. The method of claim 13, wherein said polynucleotide detection step comprises amplification of polynucleotide molecules that contain a polymorphic repetitive sequence.
 15. A method for treating a chlamydial infection in a patient, said method comprising the steps of: (a) providing a biological sample from the patient; (b) determining the presence in said biological sample of a plurality of polynucleotides containing a polymorphic repetitive sequence, wherein each polymorphic repetitive sequence is associated with one strain of chlamydia and not associated with another strain of chlamydia, and wherein the presence of a polymorphic repetitive sequence indicates the presence of said strain of chlamydia associated with said polymorphic repetitive sequence, and absence of a polymorphic repetitive sequence indicates the absence of said strain of chlamydia associated with said polymorphic repetitive sequence; and (c) administering to said patient anti-chlamydial agents that are effective against said strains of chlamydiae that are present in the biological sample.
 16. The method of claim 15, wherein said chlamydia is C. pneumoniae, C. trachomatis, C. psittaci, or C. muridarum.
 17. The method of claim 15, wherein said strain is C. pneumoniae strain CWL-029, C. pneumoniae strain AR 39, C. pneumoniae strain J138, or C. trachomatis strain D/UW-3/Cx.
 18. The method of claim 15, wherein said polymorphic repetitive sequence is a simple sequence repeat, a tandem repeat, or a large repeat.
 19. The method of claim 15, wherein said sample is a biopsy sample, blood, serum, peripheral blood mononuclear cells, cerebrospinal fluid, urine, nasal secretion, or saliva.
 20. The method of claim 15, wherein said determining of the presence of a polymorphic repetitive sequence comprises a polynucleotide detection step.
 21. The method of claim 20, wherein said polynucleotide detection step comprises amplification of polynucleotide molecules that contain a polymorphic repetitive sequence.
 22. A purified polypeptide that is substantially identical to a POMP2 polypeptide selected from SEQ ID NO: 3, SEQ ID NO: 4, and SEQ ID NO: 5, or a POMP4 polypeptide selected from SEQ ID NO: 6, SEQ ID NO: 7, and SEQ ID NO: 8.
 23. A purified polynucleotide encoding a polypeptide that is substantially identical to a POMP2 polypeptide selected from SEQ ID NO: 3, SEQ ID NO: 4, and SEQ ID NO: 5, or a POMP4 polypeptide selected from SEQ ID NO: 6, SEQ ID NO: 7, and SEQ ID NO: 8.
 24. A method of immunizing a subject against an infection of C. pneumoniae, said method comprising administering to said subject a purified POMP polypeptide or an immunogenic fragment thereof in an amount sufficient to induce an immune response to said POMP polypeptide or fragment thereof, wherein said immune response immunizes the subject against an infection of C. pneumoniae.
 25. An isolated antibody that specifically binds a POMP polypeptide of claim 22 or a fragment thereof.
 26. A method of producing an immune response in an animal, said method comprising immunizing the animal with an effective amount of a POMP polypeptide, or immunogenic fragment thereof.
 27. The method of claim 26, wherein said POMP polypeptide is a POMP2 or POMP4 polypeptide.
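
By way of illustration only, and without limiting the claims, the presence/absence logic recited in claim 8 can be sketched in Python. The marker names, the marker-to-strain table, and the function `call_strains` are hypothetical placeholders introduced for this example; they do not correspond to any sequence, locus, or procedure disclosed in the specification. The sketch assumes that an upstream polynucleotide detection step (for example, amplification) has already reported which strain-specific polymorphic repetitive sequences were detected in the sample.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical table: each polymorphic repetitive sequence (marker) is
# associated with exactly one strain. Marker names are placeholders only.
MARKER_TO_STRAIN: Dict[str, str] = {
    "repeat_marker_A": "C. pneumoniae CWL-029",
    "repeat_marker_B": "C. pneumoniae AR 39",
    "repeat_marker_C": "C. pneumoniae J138",
    "repeat_marker_D": "C. trachomatis D/UW-3/Cx",
}


def call_strains(
    detected_markers: Iterable[str],
    marker_to_strain: Dict[str, str] = MARKER_TO_STRAIN,
) -> Tuple[Set[str], Set[str]]:
    """Return (strains called present, strains called absent).

    A strain is called present if its associated marker was detected,
    and absent if its associated marker was not detected.
    """
    detected = set(detected_markers)
    unknown = detected - set(marker_to_strain)
    if unknown:
        # Refuse to guess a strain for a marker that is not in the table.
        raise ValueError(f"unrecognized markers: {sorted(unknown)}")
    present = {
        strain
        for marker, strain in marker_to_strain.items()
        if marker in detected
    }
    absent = set(marker_to_strain.values()) - present
    return present, absent


if __name__ == "__main__":
    present, absent = call_strains(["repeat_marker_A", "repeat_marker_D"])
    print("present:", sorted(present))
    print("absent:", sorted(absent))
```

In this sketch a one-to-one table is used for simplicity; an actual assay could associate several polymorphic repetitive sequences with one strain, in which case the table and the calling rule would need to be adapted.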